ZSTD(1) User Commands ZSTD(1)
2
3
4
6 zstd - zstd, zstdmt, unzstd, zstdcat - Compress or decompress .zst
7 files
8
10 zstd [OPTIONS] [-|INPUT-FILE] [-o OUTPUT-FILE]
11
12 zstdmt is equivalent to zstd -T0
13
14 unzstd is equivalent to zstd -d
15
16 zstdcat is equivalent to zstd -dcf
17
zstd is a fast lossless compression algorithm and data compression
tool, with command line syntax similar to gzip(1) and xz(1). It is
based on the LZ77 family, with further FSE and huff0 entropy stages.
zstd offers highly configurable compression speed, with fast modes at
> 200 MB/s per core, and strong modes nearing lzma compression ratios.
It also features a very fast decoder, with speeds > 500 MB/s per core.
25
zstd command line syntax is generally similar to gzip, but features the
following differences:

· Source files are preserved by default. It's possible to remove them
automatically by using the --rm option.

· When compressing a single file, zstd displays progress notifications
and a result summary by default. Use -q to turn them off.

· zstd does not accept input from the console, but it properly accepts
stdin when it's not the console.

· zstd displays a short help page when the command line contains an
error. Use -q to turn it off.
40
41
42
zstd compresses or decompresses each file according to the selected
operation mode. If no files are given or file is -, zstd reads from
standard input and writes the processed data to standard output. zstd
will refuse to write compressed data to standard output if it is a
terminal: it will display an error message and skip the file.
Similarly, zstd will refuse to read compressed data from standard input
if it is a terminal.
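
A minimal sketch of the stdin/stdout behavior described above, assuming
zstd is installed and on PATH. The function name pipe_roundtrip is
hypothetical. Because both ends are pipes rather than a terminal, zstd
accepts the compressed stream in each direction.

```shell
# Hypothetical helper: compress stdin to stdout, then decompress again.
# No file argument is given, so each zstd reads from standard input.
pipe_roundtrip() {
    zstd -q -c | zstd -q -d -c
}
```

For instance, printf 'hello\n' | pipe_roundtrip should give back the
original text unchanged.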
50
51 Unless --stdout or -o is specified, files are written to a new file
52 whose name is derived from the source file name:
53
· When compressing, the suffix .zst is appended to the source filename
to get the target filename.

· When decompressing, the .zst suffix is removed from the source
filename to get the target filename.
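
The naming rule above can be sketched with two small shell helpers. The
function names are hypothetical and only mirror the rule as stated; they
do not call zstd and say nothing about how zstd treats a source name
that lacks the .zst suffix.

```shell
# Hypothetical helpers mirroring the target-name rule described above.
zst_target_name() {
    # Compressing: append the .zst suffix to the source name.
    printf '%s.zst\n' "$1"
}

unzst_target_name() {
    # Decompressing: strip a trailing .zst suffix from the source name.
    printf '%s\n' "${1%.zst}"
}
```

For example, zst_target_name data.tar yields data.tar.zst, and
unzst_target_name data.tar.zst yields data.tar.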
59
60
61
62 Concatenation with .zst files
63 It is possible to concatenate .zst files as is. zstd will decompress
64 such files as if they were a single .zst file.
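
A small sketch of the concatenation property, assuming zstd, mktemp and
cat are available. The function name and file names are made up for the
illustration.

```shell
# Hypothetical demo: two independently compressed files, concatenated
# byte-for-byte, decompress as if they were a single .zst file.
concat_demo() {
    dir=$(mktemp -d) || return 1
    printf 'first\n'  > "$dir/a.txt"
    printf 'second\n' > "$dir/b.txt"
    zstd -q "$dir/a.txt" "$dir/b.txt"        # creates a.txt.zst and b.txt.zst
    cat "$dir/a.txt.zst" "$dir/b.txt.zst" > "$dir/both.zst"
    zstd -q -d -c "$dir/both.zst"            # prints both payloads in order
    rm -rf "$dir"
}
```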
65
67 Integer suffixes and special values
68 In most places where an integer argument is expected, an optional suf‐
69 fix is supported to easily indicate large integers. There must be no
70 space between the integer and the suffix.
71
72 KiB Multiply the integer by 1,024 (2^10). Ki, K, and KB are accepted
73 as synonyms for KiB.
74
75 MiB Multiply the integer by 1,048,576 (2^20). Mi, M, and MB are
76 accepted as synonyms for MiB.
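
The suffix rules above amount to a simple multiplication. The following
sketch reproduces them in plain shell; expand_size is a hypothetical
helper, not part of zstd, and it assumes well-formed input such as 64K
or 8MiB.

```shell
# Hypothetical helper: expand an integer with an optional size suffix.
expand_size() {
    n=${1%%[KM]*}                 # digits before any K.. or M.. suffix
    case "$1" in
        *KiB|*Ki|*KB|*K) echo $(( n * 1024 )) ;;      # 2^10
        *MiB|*Mi|*MB|*M) echo $(( n * 1048576 )) ;;   # 2^20
        *)               echo "$1" ;;                 # no suffix
    esac
}
```

With this helper, expand_size 64K gives 65536 and expand_size 8MiB gives
8388608.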
77
78 Operation mode
79 If multiple operation mode options are given, the last one takes
80 effect.
81
82 -z, --compress
83 Compress. This is the default operation mode when no operation
84 mode option is specified and no other operation mode is implied
85 from the command name (for example, unzstd implies --decom‐
86 press).
87
88 -d, --decompress, --uncompress
89 Decompress.
90
91 -t, --test
92 Test the integrity of compressed files. This option is equiva‐
93 lent to --decompress --stdout except that the decompressed data
94 is discarded instead of being written to standard output. No
95 files are created or removed.
96
97 -b# Benchmark file(s) using compression level #
98
99 --train FILEs
100 Use FILEs as a training set to create a dictionary. The training
101 set should contain a lot of small files (> 100).
102
103 -l, --list
104 Display information related to a zstd compressed file, such as
105 size, ratio, and checksum. Some of these fields may not be
106 available. This command can be augmented with the -v modifier.
107
108 Operation modifiers
109 · -#: # compression level [1-19] (default: 3)
110
111 · --ultra: unlocks high compression levels 20+ (maximum 22), using a
112 lot more memory. Note that decompression will also require more
113 memory when using these levels.
114
115 · --fast[=#]: switch to ultra-fast compression levels. If =# is not
116 present, it defaults to 1. The higher the value, the faster the
117 compression speed, at the cost of some compression ratio. This set‐
118 ting overwrites compression level if one was set previously. Simi‐
119 larly, if a compression level is set after --fast, it overrides it.
120
· -T#, --threads=#: Compress using # working threads (default: 1). If
# is 0, attempt to detect and use the number of physical CPU cores.
In all cases, the number of threads is capped to
ZSTDMT_NBWORKERS_MAX==200. This modifier does nothing if zstd is
compiled without multithread support.
126
· --single-thread: Does not spawn a thread for compression; uses a
single thread for both I/O and compression. In this mode, compression
is serialized with I/O, which is slightly slower. (This is different
from -T1, which spawns 1 compression thread in parallel with I/O.)
This mode is the only one available when multithread support is
disabled. Single-thread mode features lower memory usage. The final
compressed result is slightly different from -T1.
134
· --adapt[=min=#,max=#]: zstd will dynamically adapt compression level
to perceived I/O conditions. Compression level adaptation can be
observed live by using the -v option. Adaptation can be constrained
between supplied min and max levels. The feature works when combined
with multi-threading and --long mode. It does not work with
--single-thread. It sets window size to 8 MB by default (can be
changed manually, see wlog). Due to the chaotic nature of dynamic
adaptation, the compressed result is not reproducible. Note: at the
time of this writing, --adapt can remain stuck at low speed when
combined with multiple worker threads (>=2).
145
· --long[=#]: enables long distance matching with # windowLog; if # is
not present, it defaults to 27. This increases the window size
(windowLog) and memory usage for both the compressor and decompressor.
This setting is designed to improve the compression ratio for files
with long matches at a large distance.
151
152 Note: If windowLog is set to larger than 27, --long=windowLog or
153 --memory=windowSize needs to be passed to the decompressor.
154
155 · -D DICT: use DICT as Dictionary to compress or decompress FILE(s)
156
· --patch-from FILE: Specify the file to be used as a reference point
for zstd's diff engine. This is effectively dictionary compression
with some convenient parameter selection, namely that windowSize >
srcSize.

Note: cannot use both this and -D together.

Note: --long mode will be automatically activated if chainLog < fileLog
(fileLog being the windowLog required to cover the whole file). You can
also manually force it.

Note: for all levels, you can use --patch-from in --single-thread mode
to improve compression ratio at the cost of speed.

Note: for level 19, you can get increased compression ratio at the cost
of speed by specifying --zstd=targetLength= to be something large
(e.g. 4096), and by setting a large --zstd=chainLog=.
170
· --rsyncable: zstd will periodically synchronize the compression
state to make the compressed file more rsync-friendly. There is a
negligible impact to compression ratio, and the faster compression
levels will see a small compression speed hit. This feature does not
work with --single-thread. You probably don't want to use it with long
range mode, since it will decrease the effectiveness of the
synchronization points, but your mileage may vary.
178
179 · -C, --[no-]check: add integrity check computed from uncompressed
180 data (default: enabled)
181
182 · --[no-]content-size: enable / disable whether or not the original
183 size of the file is placed in the header of the compressed file.
184 The default option is --content-size (meaning that the original
185 size will be placed in the header).
186
· --no-dictID: do not store dictionary ID within frame header
(dictionary compression). The decoder will have to rely on implicit
knowledge about which dictionary to use; it won't be able to check
whether it's correct.
191
· -M#, --memory=#: Set a memory usage limit. By default, Zstandard
uses 128 MB for decompression as the maximum amount of memory the
decompressor is allowed to use, but you can override this manually if
need be in either direction (i.e. you can increase or decrease it).

This is also used during compression when combined with --patch-from=.
In this case, this parameter overrides the maximum size allowed for a
dictionary (128 MB).
201
202 · --stream-size=# : Sets the pledged source size of input coming from
203 a stream. This value must be exact, as it will be included in the
204 produced frame header. Incorrect stream sizes will cause an error.
205 This information will be used to better optimize compression param‐
206 eters, resulting in better and potentially faster compression,
207 especially for smaller source sizes.
208
· --size-hint=#: When handling input from a stream, zstd must guess
how large the source size will be when optimizing compression
parameters. If the stream size is relatively small, this guess may be
a poor one, resulting in a worse compression ratio than expected. This
feature allows for controlling the guess when needed. Exact guesses
result in better compression ratios. Overestimates result in slightly
degraded compression ratios, while underestimates may result in
significant degradation.
217
218 · -o FILE: save result into FILE
219
220 · -f, --force: overwrite output without prompting, and (de)compress
221 symbolic links
222
223 · -c, --stdout: force write to standard output, even if it is the
224 console
225
226 · --[no-]sparse: enable / disable sparse FS support, to make files
227 with many zeroes smaller on disk. Creating sparse files may save
228 disk space and speed up decompression by reducing the amount of
229 disk I/O. default: enabled when output is into a file, and disabled
230 when output is stdout. This setting overrides default and can force
231 sparse mode over stdout.
232
233 · --rm: remove source file(s) after successful compression or decom‐
234 pression. If used in combination with -o, will trigger a confirma‐
235 tion prompt (which can be silenced with -f), as this is a destruc‐
236 tive operation.
237
238 · -k, --keep: keep source file(s) after successful compression or
239 decompression. This is the default behavior.
240
241 · -r: operate recursively on directories
242
· --filelist FILE: read a list of files to process as content from
FILE. Format is compatible with ls output, with one file per line.
245
246 · --output-dir-flat DIR: resulting files are stored into target DIR
247 directory, instead of same directory as origin file. Be aware that
248 this command can introduce name collision issues, if multiple
249 files, from different directories, end up having the same name.
250 Collision resolution ensures first file with a given name will be
251 present in DIR, while in combination with -f, the last file will be
252 present instead.
253
254 · --output-dir-mirror DIR: similar to --output-dir-flat, the output
255 files are stored underneath target DIR directory, but this option
256 will replicate input directory hierarchy into output DIR.
257
If input directory contains "..", the files in this directory will be
ignored. If input directory is an absolute directory (e.g.
"/var/tmp/abc"), it will be stored into "output-dir/var/tmp/abc". If
there are multiple input files or directories, name collision
resolution will follow the same rules as --output-dir-flat.
264
265 · --format=FORMAT: compress and decompress in other formats. If com‐
266 piled with support, zstd can compress to or decompress from other
267 compression algorithm formats. Possibly available options are zstd,
268 gzip, xz, lzma, and lz4. If no such format is provided, zstd is the
269 default.
270
271 · -h/-H, --help: display help/long help and exit
272
273 · -V, --version: display version number and exit. Advanced : -vV also
274 displays supported formats. -vvV also displays POSIX support. -q
275 will only display the version number, suitable for machine reading.
276
277 · -v, --verbose: verbose mode, display more information
278
· -q, --quiet: suppress warnings, interactivity, and notifications.
Specify twice to suppress errors too.
281
282 · --no-progress: do not display the progress bar, but keep all other
283 messages.
284
· --show-default-cparams: Shows the default compression parameters
that will be used for a particular src file. If the provided src file
is not a regular file (e.g. a named pipe), the CLI will just output the
default parameters, that is, the parameters that are used when the src
size is unknown.
290
291 · --: All arguments after -- are treated as files
292
293
295 Additional options for the pzstd utility
296
297 -p, --processes
298 number of threads to use for (de)compression (default:4)
299
300
301
302
303
304 Restricted usage of Environment Variables
305 Using environment variables to set parameters has security implica‐
306 tions. Therefore, this avenue is intentionally restricted. Only
307 ZSTD_CLEVEL and ZSTD_NBTHREADS are currently supported. They set the
308 compression level and number of threads to use during compression,
309 respectively.
310
311 ZSTD_CLEVEL can be used to set the level between 1 and 19 (the "normal"
312 range). If the value of ZSTD_CLEVEL is not a valid integer, it will be
313 ignored with a warning message. ZSTD_CLEVEL just replaces the default
314 compression level (3).
315
ZSTD_NBTHREADS can be used to set the number of threads zstd will
attempt to use during compression. If the value of ZSTD_NBTHREADS is
not a valid unsigned integer, it will be ignored with a warning
message. ZSTD_NBTHREADS has a default value of 1, and is capped at
ZSTDMT_NBWORKERS_MAX==200. zstd must be compiled with multithread
support for this to have any effect.
322
323 They can both be overridden by corresponding command line arguments: -#
324 for compression level and -T# for number of compression threads.
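
A brief sketch of how the environment variable and the command line
interact, assuming zstd and mktemp are available. The function and file
names are hypothetical; only ZSTD_CLEVEL and the -# option come from the
text above.

```shell
# Hypothetical demo: ZSTD_CLEVEL replaces the default level, and an
# explicit -# on the command line overrides the environment variable.
env_level_demo() {
    dir=$(mktemp -d) || return 1
    printf 'env example\n' > "$dir/in.txt"
    ZSTD_CLEVEL=19 zstd -q "$dir/in.txt" -o "$dir/env.zst"     # level 19
    ZSTD_CLEVEL=19 zstd -q -1 "$dir/in.txt" -o "$dir/cli.zst"  # -1 wins
    zstd -q -d -c "$dir/env.zst"     # either file decompresses normally
    rm -rf "$dir"
}
```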
325
zstd offers dictionary compression, which greatly improves efficiency
on small files and messages. It's possible to train zstd with a set of
samples, the result of which is saved into a file called a dictionary.
Then, during compression and decompression, reference the same
dictionary using the option -D dictionaryFileName. Compression of small
files similar to the sample set will be greatly improved.
333
--train FILEs
Use FILEs as training set to create a dictionary. The training set
should contain a lot of small files (> 100), and typically weigh 100x
the target dictionary size (for example, 10 MB for a 100 KB
dictionary).
339
340 Supports multithreading if zstd is compiled with threading sup‐
341 port. Additional parameters can be specified with --train-fast‐
342 cover. The legacy dictionary builder can be accessed with
343 --train-legacy. The cover dictionary builder can be accessed
344 with --train-cover. Equivalent to --train-fastcover=d=8,steps=4.
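
The sizing guideline above (a training set of roughly 100x the target
dictionary size) is simple arithmetic. The helper below is a
hypothetical illustration of that rule of thumb, not a zstd feature.

```shell
# Hypothetical helper: suggested total training-set size in bytes for a
# target dictionary size in bytes, using the 100x guideline from the text.
suggested_training_bytes() {
    echo $(( $1 * 100 ))
}
```

For a 100 KB dictionary (102400 bytes) this suggests 10240000 bytes,
which matches the "10 MB for a 100 KB dictionary" example.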
345
346 -o file
347 Dictionary saved into file (default name: dictionary).
348
349 --maxdict=#
350 Limit dictionary to specified size (default: 112640).
351
352 -# Use # compression level during training (optional). Will gener‐
353 ate statistics more tuned for selected compression level,
354 resulting in a small compression ratio improvement for this
355 level.
356
357 -B# Split input files in blocks of size # (default: no split)
358
359 --dictID=#
A dictionary ID is a locally unique ID that a decoder can use to verify
it is using the right dictionary. By default, zstd will create a 4-byte
random number ID. It's possible to give a precise number instead. Short
numbers have an advantage: an ID < 256 will only need 1 byte in the
compressed frame header, and an ID < 65536 will only need 2 bytes. This
compares favorably to the 4-byte default. However, it's up to the
dictionary manager not to assign the same ID to 2 different
dictionaries.
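
The header cost described above can be sketched as a small lookup. The
helper name is hypothetical, and it covers only the three cases the
text mentions.

```shell
# Hypothetical helper: bytes the dictionary ID occupies in the frame
# header, following the thresholds given in the text.
dictid_field_bytes() {
    if [ "$1" -lt 256 ]; then
        echo 1
    elif [ "$1" -lt 65536 ]; then
        echo 2
    else
        echo 4
    fi
}
```

So --dictID=200 would cost 1 byte per frame, --dictID=40000 would cost
2 bytes, and a random 4-byte ID costs 4.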
368
--train-cover[=k=#,d=#,steps=#,split=#,shrink[=#]]
370 Select parameters for the default dictionary builder algorithm
371 named cover. If d is not specified, then it tries d = 6 and d =
372 8. If k is not specified, then it tries steps values in the
373 range [50, 2000]. If steps is not specified, then the default
374 value of 40 is used. If split is not specified or split <= 0,
375 then the default value of 100 is used. Requires that d <= k. If
376 shrink flag is not used, then the default value for shrinkDict
377 of 0 is used. If shrink is not specified, then the default value
378 for shrinkDictMaxRegression of 1 is used.
379
380 Selects segments of size k with highest score to put in the dic‐
381 tionary. The score of a segment is computed by the sum of the
382 frequencies of all the subsegments of size d. Generally d should
383 be in the range [6, 8], occasionally up to 16, but the algorithm
384 will run faster with d <= 8. Good values for k vary widely based
385 on the input data, but a safe range is [2 * d, 2000]. If split
386 is 100, all input samples are used for both training and testing
387 to find optimal d and k to build dictionary. Supports multi‐
388 threading if zstd is compiled with threading support. Having
389 shrink enabled takes a truncated dictionary of minimum size and
390 doubles in size until compression ratio of the truncated dictio‐
391 nary is at most shrinkDictMaxRegression% worse than the compres‐
392 sion ratio of the largest dictionary.
393
394 Examples:
395
396 zstd --train-cover FILEs
397
398 zstd --train-cover=k=50,d=8 FILEs
399
400 zstd --train-cover=d=8,steps=500 FILEs
401
402 zstd --train-cover=k=50 FILEs
403
404 zstd --train-cover=k=50,split=60 FILEs
405
406 zstd --train-cover=shrink FILEs
407
408 zstd --train-cover=shrink=2 FILEs
409
--train-fastcover[=k=#,d=#,f=#,steps=#,split=#,accel=#]
Same as cover but with extra parameters f and accel and a different
default value of split. If split is not specified, then it tries
split = 75. If f is not specified, then it tries f = 20. Requires that
0 < f < 32. If accel is not specified, then it tries accel = 1.
Requires that 0 < accel <= 10. Requires that d = 6 or d = 8.
417
418 f is log of size of array that keeps track of frequency of sub‐
419 segments of size d. The subsegment is hashed to an index in the
420 range [0,2^f - 1]. It is possible that 2 different subsegments
421 are hashed to the same index, and they are considered as the
422 same subsegment when computing frequency. Using a higher f
423 reduces collision but takes longer.
424
425 Examples:
426
427 zstd --train-fastcover FILEs
428
429 zstd --train-fastcover=d=8,f=15,accel=2 FILEs
430
431 --train-legacy[=selectivity=#]
432 Use legacy dictionary builder algorithm with the given dictio‐
433 nary selectivity (default: 9). The smaller the selectivity
434 value, the denser the dictionary, improving its efficiency but
435 reducing its possible maximum size. --train-legacy=s=# is also
436 accepted.
437
438 Examples:
439
440 zstd --train-legacy FILEs
441
442 zstd --train-legacy=selectivity=8 FILEs
443
445 -b# benchmark file(s) using compression level #
446
447 -e# benchmark file(s) using multiple compression levels, from -b# to
448 -e# (inclusive)
449
450 -i# minimum evaluation time, in seconds (default: 3s), benchmark
451 mode only
452
453 -B#, --block-size=#
454 cut file(s) into independent blocks of size # (default: no
455 block)
456
457 --priority=rt
458 set process priority to real-time
459
Output Format: CompressionLevel#Filename : InputSize -> OutputSize
(CompressionRatio), CompressionSpeed, DecompressionSpeed
462
463 Methodology: For both compression and decompression speed, the entire
464 input is compressed/decompressed in-memory to measure speed. A run
465 lasts at least 1 sec, so when files are small, they are com‐
466 pressed/decompressed several times per run, in order to improve mea‐
467 surement accuracy.
468
470 --zstd[=options]:
471 zstd provides 22 predefined compression levels. The selected or default
472 predefined compression level can be changed with advanced compression
473 options. The options are provided as a comma-separated list. You may
474 specify only the options you want to change and the rest will be taken
475 from the selected or default compression level. The list of available
476 options:
477
478 strategy=strat, strat=strat
479 Specify a strategy used by a match finder.
480
481 There are 9 strategies numbered from 1 to 9, from faster to
482 stronger: 1=ZSTD_fast, 2=ZSTD_dfast, 3=ZSTD_greedy, 4=ZSTD_lazy,
483 5=ZSTD_lazy2, 6=ZSTD_btlazy2, 7=ZSTD_btopt, 8=ZSTD_btultra,
484 9=ZSTD_btultra2.
485
486 windowLog=wlog, wlog=wlog
487 Specify the maximum number of bits for a match distance.
488
A higher number of bits increases the chance to find a match, which
usually improves compression ratio. It also increases memory
requirements for the compressor and decompressor. The minimum wlog is
10 (1 KiB) and the maximum is 30 (1 GiB) on 32-bit platforms and 31
(2 GiB) on 64-bit platforms.
494
495 Note: If windowLog is set to larger than 27, --long=windowLog or
496 --memory=windowSize needs to be passed to the decompressor.
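
The relation between windowLog and window size is a power of two, which
the following sketch computes. Both helper names are hypothetical; the
threshold of 27 is the one stated in the note above.

```shell
# Hypothetical helper: window size in bytes for a given windowLog.
window_bytes() {
    echo $(( 1 << $1 ))
}

# Hypothetical helper: succeeds when the decompressor must be told about
# the window (windowLog larger than 27, per the note above).
needs_decoder_hint() {
    [ "$1" -gt 27 ]
}
```

For example, window_bytes 27 gives 134217728 (128 MiB), and
needs_decoder_hint 30 succeeds while needs_decoder_hint 27 does not.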
497
498 hashLog=hlog, hlog=hlog
499 Specify the maximum number of bits for a hash table.
500
Bigger hash tables cause fewer collisions, which usually makes
compression faster, but require more memory during compression.
503
504 The minimum hlog is 6 (64 B) and the maximum is 30 (1 GiB).
505
506 chainLog=clog, clog=clog
507 Specify the maximum number of bits for a hash chain or a binary
508 tree.
509
Higher numbers of bits increase the chance to find a match, which
usually improves compression ratio. It also slows down compression
speed and increases memory requirements for compression. This option is
ignored for the ZSTD_fast strategy.

The minimum clog is 6 (64 B) and the maximum is 29 (512 MiB) on 32-bit
platforms and 30 (1 GiB) on 64-bit platforms.
517
518 searchLog=slog, slog=slog
519 Specify the maximum number of searches in a hash chain or a
520 binary tree using logarithmic scale.
521
More searches increase the chance to find a match, which usually
increases compression ratio but decreases compression speed.

The minimum slog is 1 and the maximum is windowLog - 1.
526
527 minMatch=mml, mml=mml
528 Specify the minimum searched length of a match in a hash table.
529
530 Larger search lengths usually decrease compression ratio but
531 improve decompression speed.
532
533 The minimum mml is 3 and the maximum is 7.
534
535 targetLength=tlen, tlen=tlen
The impact of this field varies depending on the selected strategy.

For ZSTD_btopt, ZSTD_btultra and ZSTD_btultra2, it specifies the
minimum match length that causes the match finder to stop searching. A
larger targetLength usually improves compression ratio but decreases
compression speed.

For ZSTD_fast, it triggers ultra-fast mode when > 0. The value
represents the amount of data skipped between match sampling. Impact is
reversed: a larger targetLength increases compression speed but
decreases compression ratio.

For all other strategies, this field has no impact.

The minimum tlen is 0 and the maximum is 128 KiB.
550
551 overlapLog=ovlog, ovlog=ovlog
552 Determine overlapSize, amount of data reloaded from previous
553 job. This parameter is only available when multithreading is
554 enabled. Reloading more data improves compression ratio, but
555 decreases speed.
556
The minimum ovlog is 0, and the maximum is 9. 1 means "no overlap",
hence completely independent jobs. 9 means "full overlap", meaning up
to windowSize is reloaded from previous job. Reducing ovlog by 1
reduces the reloaded amount by a factor of 2. For example, 8 means
"windowSize/2", and 6 means "windowSize/8". Value 0 is special and
means "default": ovlog is automatically determined by zstd, in which
case it will range from 6 to 9, depending on the selected strat.
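
The halving rule above can be sketched as a shift. The helper is
hypothetical; values 2 through 5 are extrapolated from the stated
examples (9, 8 and 6), which is an assumption rather than something the
text spells out.

```shell
# Hypothetical helper: overlap size in bytes for a windowLog and ovlog.
#   $1 = windowLog, $2 = ovlog (0..9)
overlap_bytes() {
    wlog=$1
    ovlog=$2
    case "$ovlog" in
        0) echo "default" ;;                            # chosen by zstd
        1) echo 0 ;;                                    # no overlap
        *) echo $(( (1 << wlog) >> (9 - ovlog) )) ;;    # halves per step
    esac
}
```

With windowLog 23 (8 MiB), ovlog 9 gives 8388608, ovlog 8 gives 4194304
and ovlog 6 gives 1048576.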
565
566 ldmHashLog=lhlog, lhlog=lhlog
567 Specify the maximum size for a hash table used for long distance
568 matching.
569
570 This option is ignored unless long distance matching is enabled.
571
572 Bigger hash tables usually improve compression ratio at the
573 expense of more memory during compression and a decrease in com‐
574 pression speed.
575
576 The minimum lhlog is 6 and the maximum is 30 (default: 20).
577
578 ldmMinMatch=lmml, lmml=lmml
579 Specify the minimum searched length of a match for long distance
580 matching.
581
582 This option is ignored unless long distance matching is enabled.
583
584 Larger/very small values usually decrease compression ratio.
585
586 The minimum lmml is 4 and the maximum is 4096 (default: 64).
587
588 ldmBucketSizeLog=lblog, lblog=lblog
589 Specify the size of each bucket for the hash table used for long
590 distance matching.
591
592 This option is ignored unless long distance matching is enabled.
593
594 Larger bucket sizes improve collision resolution but decrease
595 compression speed.
596
597 The minimum lblog is 1 and the maximum is 8 (default: 3).
598
599 ldmHashRateLog=lhrlog, lhrlog=lhrlog
600 Specify the frequency of inserting entries into the long dis‐
601 tance matching hash table.
602
603 This option is ignored unless long distance matching is enabled.
604
605 Larger values will improve compression speed. Deviating far from
606 the default value will likely result in a decrease in compres‐
607 sion ratio.
608
609 The default value is wlog - lhlog.
610
611 Example
The following parameters set advanced compression options to something
similar to predefined level 19 for files bigger than 256 KB:
614
615 --zstd=wlog=23,clog=23,hlog=22,slog=6,mml=3,tlen=48,strat=6
616
617 -B#:
Select the size of each compression job. This parameter is available
only when multi-threading is enabled. The default value is
4 * windowSize, which means it varies depending on compression level.
-B# makes it possible to select a custom value. Note that job size must
respect a minimum value, which is enforced transparently. This minimum
is either 1 MB or overlapSize, whichever is larger.
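
A rough sketch of the job size rules above. Both helpers are
hypothetical, and reading "1 MB" as 1048576 bytes is an assumption.

```shell
# Hypothetical helper: default job size in bytes, i.e. 4 * windowSize,
# for a given windowLog.
default_job_bytes() {
    echo $(( 4 << $1 ))
}

# Hypothetical helper: minimum job size for a given overlapSize in bytes.
min_job_bytes() {
    floor=1048576              # assumption: "1 MB" taken as 2^20 bytes
    if [ "$1" -gt "$floor" ]; then
        echo "$1"
    else
        echo "$floor"
    fi
}
```

For windowLog 23 the default job size comes to 33554432 bytes (32 MiB).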
624
626 Report bugs at: https://github.com/facebook/zstd/issues
627
629 Yann Collet
630
631
632
zstd 1.4.8 December 2020 ZSTD(1)