unzstd(1)

ZSTD(1)                          User Commands                         ZSTD(1)

NAME

       zstd - zstd, zstdmt, unzstd, zstdcat - Compress or decompress .zst
       files

SYNOPSIS

       zstd [OPTIONS] [-|INPUT-FILE] [-o OUTPUT-FILE]

       zstdmt is equivalent to zstd -T0

       unzstd is equivalent to zstd -d

       zstdcat is equivalent to zstd -dcf

DESCRIPTION

       zstd is a fast lossless compression algorithm and data compression
       tool, with command line syntax similar to gzip(1) and xz(1). It is
       based on the LZ77 family, with further FSE & huff0 entropy stages. zstd
       offers highly configurable compression speed, with fast modes at > 200
       MB/s per core, and strong modes nearing lzma compression ratios. It
       also features a very fast decoder, with speeds > 500 MB/s per core.

       zstd command line syntax is generally similar to gzip, but features the
       following differences:

       ·   Source files are preserved by default. It's possible to remove them
           automatically by using the --rm option.

       ·   When compressing a single file, zstd displays progress
           notifications and a result summary by default. Use -q to turn them
           off.

       ·   zstd does not accept input from the console, but it properly
           accepts stdin when it's not the console.

       ·   zstd displays a short help page when the command line contains an
           error. Use -q to turn it off.

       zstd compresses or decompresses each file according to the selected
       operation mode. If no files are given or file is -, zstd reads from
       standard input and writes the processed data to standard output. zstd
       will refuse to write compressed data to standard output if it is a
       terminal: it will display an error message and skip the file.
       Similarly, zstd will refuse to read compressed data from standard input
       if it is a terminal.

       Unless --stdout or -o is specified, files are written to a new file
       whose name is derived from the source file name:

       ·   When compressing, the suffix .zst is appended to the source
           filename to get the target filename.

       ·   When decompressing, the .zst suffix is removed from the source
           filename to get the target filename.

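       The naming rule above can be sketched as a small shell function. The
       helper name zst_target_name is hypothetical and not part of zstd, and
       rejecting a name that lacks the .zst suffix is an assumption of this
       sketch, not a statement about what zstd itself does.

```shell
# Hypothetical helper mirroring the documented naming rule; not part of zstd.
# usage: zst_target_name compress|decompress FILENAME
zst_target_name() {
    mode=$1
    name=$2
    case "$mode" in
        compress)
            # Compression appends the .zst suffix to the source name.
            printf '%s\n' "$name.zst"
            ;;
        decompress)
            case "$name" in
                # Decompression strips a trailing .zst suffix.
                *.zst) printf '%s\n' "${name%.zst}" ;;
                # Assumption of this sketch: anything else is rejected.
                *) echo "no .zst suffix: $name" >&2; return 1 ;;
            esac
            ;;
        *)
            echo "unknown mode: $mode" >&2
            return 1
            ;;
    esac
}

zst_target_name compress notes.txt        # prints notes.txt.zst
zst_target_name decompress notes.txt.zst  # prints notes.txt
```
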
   Concatenation with .zst files
       It is possible to concatenate .zst files as is. zstd will decompress
       such files as if they were a single .zst file.

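       A minimal way to try the concatenation behavior described above,
       assuming a zstd binary is available on PATH. The function name
       concat_demo and the temporary file names are made up for this sketch;
       only the -q, -c and -d options documented on this page are used.

```shell
# Sketch: compress two streams separately, concatenate the .zst files
# byte for byte, then decompress the result in one pass.
concat_demo() {
    tmp=$(mktemp -d) || return 1
    printf 'hello ' | zstd -q -c > "$tmp/a.zst"
    printf 'world'  | zstd -q -c > "$tmp/b.zst"
    # Plain concatenation of the two compressed files.
    cat "$tmp/a.zst" "$tmp/b.zst" > "$tmp/ab.zst"
    # Decompress the combined file to standard output.
    zstd -q -d -c "$tmp/ab.zst"
    rm -rf "$tmp"
}

# Only run the demo when zstd is actually installed.
if command -v zstd >/dev/null 2>&1; then
    concat_demo
fi
```

       If the behavior matches the description above, the demo prints the
       two payloads back to back.
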

OPTIONS

   Integer suffixes and special values
       In most places where an integer argument is expected, an optional
       suffix is supported to easily indicate large integers. There must be no
       space between the integer and the suffix.

       KiB    Multiply the integer by 1,024 (2^10). Ki, K, and KB are accepted
              as synonyms for KiB.

       MiB    Multiply the integer by 1,048,576 (2^20). Mi, M, and MB are
              accepted as synonyms for MiB.

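       The suffix arithmetic above can be illustrated with a short shell
       function. This is only a sketch: expand_size is a hypothetical name,
       and zstd's own argument parser may accept or reject inputs
       differently.

```shell
# Hypothetical helper: expand the KiB/MiB suffixes (and their synonyms)
# listed above into a plain byte count.
expand_size() {
    value=$1
    case "$value" in
        *KiB) num=${value%KiB}; mult=1024 ;;
        *MiB) num=${value%MiB}; mult=1048576 ;;
        *KB)  num=${value%KB};  mult=1024 ;;
        *MB)  num=${value%MB};  mult=1048576 ;;
        *Ki)  num=${value%Ki};  mult=1024 ;;
        *Mi)  num=${value%Mi};  mult=1048576 ;;
        *K)   num=${value%K};   mult=1024 ;;
        *M)   num=${value%M};   mult=1048576 ;;
        *)    num=$value;       mult=1 ;;
    esac
    # The numeric part must be digits only, so "64 KiB" (with a space)
    # is rejected, matching the no-space rule stated above.
    case "$num" in
        ''|*[!0-9]*) echo "invalid integer: $value" >&2; return 1 ;;
    esac
    echo $((num * mult))
}

expand_size 64KiB   # prints 65536
expand_size 8M      # prints 8388608
```
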
   Operation mode
       If multiple operation mode options are given, the last one takes
       effect.

       -z, --compress
              Compress. This is the default operation mode when no operation
              mode option is specified and no other operation mode is implied
              from the command name (for example, unzstd implies
              --decompress).

       -d, --decompress, --uncompress
              Decompress.

       -t, --test
              Test the integrity of compressed files. This option is
              equivalent to --decompress --stdout except that the decompressed
              data is discarded instead of being written to standard output.
              No files are created or removed.

       -b#    Benchmark file(s) using compression level #

       --train FILEs
              Use FILEs as a training set to create a dictionary. The training
              set should contain a lot of small files (> 100).

       -l, --list
              Display information related to a zstd compressed file, such as
              size, ratio, and checksum. Some of these fields may not be
              available. This command can be augmented with the -v modifier.

   Operation modifiers
       -#     # compression level [1-19] (default: 3)

       --fast[=#]
              switch to ultra-fast compression levels. If =# is not present,
              it defaults to 1. The higher the value, the faster the
              compression speed, at the cost of some compression ratio. This
              setting overrides the compression level if one was set
              previously. Similarly, if a compression level is set after
              --fast, it overrides it.

       --ultra
              unlocks high compression levels 20+ (maximum 22), using a lot
              more memory. Note that decompression will also require more
              memory when using these levels.

       --long[=#]
              enables long distance matching with # windowLog; if # is not
              present, it defaults to 27. This increases the window size
              (windowLog) and memory usage for both the compressor and
              decompressor. This setting is designed to improve the
              compression ratio for files with long matches at a large
              distance.

              Note: If windowLog is set to larger than 27, --long=windowLog or
              --memory=windowSize needs to be passed to the decompressor.

       -T#, --threads=#
              Compress using # working threads (default: 1). If # is 0,
              attempt to detect and use the number of physical CPU cores. In
              all cases, the number of threads is capped to
              ZSTDMT_NBTHREADS_MAX==200. This modifier does nothing if zstd is
              compiled without multithread support.

       --single-thread
              Does not spawn a thread for compression; uses a single thread
              for both I/O and compression. In this mode, compression is
              serialized with I/O, which is slightly slower. (This is
              different from -T1, which spawns 1 compression thread in
              parallel with I/O.) This mode is the only one available when
              multithread support is disabled. Single-thread mode features
              lower memory usage. The final compressed result is slightly
              different from -T1.

       --adapt[=min=#,max=#]
              zstd will dynamically adapt compression level to perceived I/O
              conditions. Compression level adaptation can be observed live by
              using command -v. Adaptation can be constrained between supplied
              min and max levels. The feature works when combined with
              multi-threading and --long mode. It does not work with
              --single-thread. It sets window size to 8 MB by default (can be
              changed manually, see wlog). Due to the chaotic nature of
              dynamic adaptation, the compressed result is not reproducible.
              Note: at the time of this writing, --adapt can remain stuck at
              low speed when combined with multiple worker threads (>=2).

       --stream-size=#
              Sets the pledged source size of input coming from a stream. This
              value must be exact, as it will be included in the produced
              frame header. Incorrect stream sizes will cause an error. This
              information will be used to better optimize compression
              parameters, resulting in better and potentially faster
              compression, especially for smaller source sizes.

       --size-hint=#
              When handling input from a stream, zstd must guess how large the
              source size will be when optimizing compression parameters. If
              the stream size is relatively small, this guess may be a poor
              one, resulting in a worse compression ratio than expected. This
              feature allows for controlling the guess when needed. Exact
              guesses result in better compression ratios. Overestimates
              result in slightly degraded compression ratios, while
              underestimates may result in significant degradation.

       --rsyncable
              zstd will periodically synchronize the compression state to make
              the compressed file more rsync-friendly. There is a negligible
              impact to compression ratio, and the faster compression levels
              will see a small compression speed hit. This feature does not
              work with --single-thread. You probably don't want to use it
              with long range mode, since it will decrease the effectiveness
              of the synchronization points, but your mileage may vary.

       -D file
              use file as Dictionary to compress or decompress FILE(s)

       --no-dictID
              do not store dictionary ID within frame header (dictionary
              compression). The decoder will have to rely on implicit
              knowledge about which dictionary to use; it won't be able to
              check if it's correct.

       -o file
              save result into file (only possible with a single INPUT-FILE)

       -f, --force
              overwrite output without prompting, and (de)compress symbolic
              links

       -c, --stdout
              force write to standard output, even if it is the console

       --[no-]sparse
              enable / disable sparse FS support, to make files with many
              zeroes smaller on disk. Creating sparse files may save disk
              space and speed up decompression by reducing the amount of disk
              I/O. default: enabled when output is into a file, and disabled
              when output is stdout. This setting overrides default and can
              force sparse mode over stdout.

       --rm   remove source file(s) after successful compression or
              decompression

       -k, --keep
              keep source file(s) after successful compression or
              decompression. This is the default behavior.

       -r     operate recursively on directories

       --output-dir-flat[=dir]
              resulting files are stored in the target directory dir, instead
              of the same directory as the original file. Be aware that this
              command can introduce name collision issues if multiple files
              from different directories end up having the same name.
              Collision resolution ensures the first file with a given name
              will be present in dir, while in combination with -f, the last
              file will be present instead.

       --format=FORMAT
              compress and decompress in other formats. If compiled with
              support, zstd can compress to or decompress from other
              compression algorithm formats. Possibly available options are
              zstd, gzip, xz, lzma, and lz4. If no such format is provided,
              zstd is the default.

       -h/-H, --help
              display help/long help and exit

       -V, --version
              display version number and exit. Advanced: -vV also displays
              supported formats. -vvV also displays POSIX support.

       -v     verbose mode

       -q, --quiet
              suppress warnings, interactivity, and notifications. Specify
              twice to suppress errors too.

       --no-progress
              do not display the progress bar, but keep all other messages.

       -C, --[no-]check
              add integrity check computed from uncompressed data (default:
              enabled)

       --     All arguments after -- are treated as files

Parallel Zstd OPTIONS

       Additional options for the pzstd utility

       -p, --processes
              number of threads to use for (de)compression (default: 4)

   Restricted usage of Environment Variables
       Using environment variables to set parameters has security
       implications. Therefore, this avenue is intentionally restricted. Only
       ZSTD_CLEVEL is supported currently, for setting compression level.
       ZSTD_CLEVEL can be used to set the level between 1 and 19 (the "normal"
       range). If the value of ZSTD_CLEVEL is not a valid integer, it will be
       ignored with a warning message. ZSTD_CLEVEL just replaces the default
       compression level (3). It can be overridden by corresponding command
       line arguments.

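       The precedence described above (built-in default, then ZSTD_CLEVEL,
       then the command line) can be modeled in a few lines of shell. This is
       an illustrative model only: effective_level is a hypothetical name,
       and treating an out-of-range value the same way as a non-integer one
       is an assumption of this sketch.

```shell
# Illustrative model of the documented precedence; zstd's real parsing
# may differ in details.
# usage: effective_level [CLI_LEVEL]   (reads the ZSTD_CLEVEL variable)
effective_level() {
    cli_level=${1:-}
    level=3                      # built-in default compression level
    case "${ZSTD_CLEVEL:-}" in
        '')
            ;;                   # unset or empty: keep the default
        *[!0-9]*)
            # Not a valid integer: ignored with a warning.
            echo "warning: ignoring ZSTD_CLEVEL=$ZSTD_CLEVEL" >&2
            ;;
        *)
            if [ "$ZSTD_CLEVEL" -ge 1 ] && [ "$ZSTD_CLEVEL" -le 19 ]; then
                level=$ZSTD_CLEVEL
            else
                # Assumption: values outside 1..19 are ignored as well.
                echo "warning: ignoring ZSTD_CLEVEL=$ZSTD_CLEVEL" >&2
            fi
            ;;
    esac
    # A level given on the command line overrides the environment.
    if [ -n "$cli_level" ]; then
        level=$cli_level
    fi
    echo "$level"
}

(ZSTD_CLEVEL=19; effective_level)     # prints 19
(ZSTD_CLEVEL=19; effective_level 5)   # prints 5
```
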

DICTIONARY BUILDER

       zstd offers dictionary compression, which greatly improves efficiency
       on small files and messages. It's possible to train zstd with a set of
       samples, the result of which is saved into a file called a dictionary.
       Then during compression and decompression, reference the same
       dictionary, using command -D dictionaryFileName. Compression of small
       files similar to the sample set will be greatly improved.

       --train FILEs
              Use FILEs as training set to create a dictionary. The training
              set should contain a lot of small files (> 100), and weigh
              typically 100x the target dictionary size (for example, 10 MB
              for a 100 KB dictionary).

              Supports multithreading if zstd is compiled with threading
              support. Additional parameters can be specified with
              --train-fastcover. The legacy dictionary builder can be accessed
              with --train-legacy. The cover dictionary builder can be
              accessed with --train-cover. Plain --train is equivalent to
              --train-fastcover=d=8,steps=4.

       -o file
              Dictionary saved into file (default name: dictionary).

       --maxdict=#
              Limit dictionary to specified size (default: 112640).

       -#     Use # compression level during training (optional). Will
              generate statistics more tuned for selected compression level,
              resulting in a small compression ratio improvement for this
              level.

       -B#    Split input files in blocks of size # (default: no split)

       --dictID=#
              A dictionary ID is a locally unique ID that a decoder can use to
              verify it is using the right dictionary. By default, zstd will
              create a 4-byte random number ID. It's possible to give a
              precise number instead. Short numbers have an advantage: an ID <
              256 will only need 1 byte in the compressed frame header, and an
              ID < 65536 will only need 2 bytes. This compares favorably to
              the 4-byte default. However, it's up to the dictionary manager
              to not assign the same ID twice to 2 different dictionaries.

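       The header cost of a dictionary ID, as described under --dictID above,
       can be worked out with a small helper. The function name
       dict_id_bytes is hypothetical, and the sketch assumes a positive
       integer ID.

```shell
# Hypothetical helper: number of bytes a dictionary ID occupies in the
# compressed frame header, following the thresholds stated above.
dict_id_bytes() {
    id=$1
    if [ "$id" -lt 256 ]; then
        echo 1
    elif [ "$id" -lt 65536 ]; then
        echo 2
    else
        echo 4
    fi
}

dict_id_bytes 200     # prints 1
dict_id_bytes 40000   # prints 2
dict_id_bytes 70000   # prints 4
```
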
       --train-cover[=k=#,d=#,steps=#,split=#,shrink[=#]]
              Select parameters for the default dictionary builder algorithm
              named cover. If d is not specified, then it tries d = 6 and d =
              8. If k is not specified, then it tries steps values in the
              range [50, 2000]. If steps is not specified, then the default
              value of 40 is used. If split is not specified or split <= 0,
              then the default value of 100 is used. Requires that d <= k. If
              shrink flag is not used, then the default value for shrinkDict
              of 0 is used. If shrink is not specified, then the default value
              for shrinkDictMaxRegression of 1 is used.

              Selects segments of size k with highest score to put in the
              dictionary. The score of a segment is computed by the sum of the
              frequencies of all the subsegments of size d. Generally d should
              be in the range [6, 8], occasionally up to 16, but the algorithm
              will run faster with d <= 8. Good values for k vary widely based
              on the input data, but a safe range is [2 * d, 2000]. If split
              is 100, all input samples are used for both training and testing
              to find optimal d and k to build dictionary. Supports
              multithreading if zstd is compiled with threading support.
              Having shrink enabled takes a truncated dictionary of minimum
              size and doubles in size until compression ratio of the
              truncated dictionary is at most shrinkDictMaxRegression% worse
              than the compression ratio of the largest dictionary.

              Examples:

              zstd --train-cover FILEs

              zstd --train-cover=k=50,d=8 FILEs

              zstd --train-cover=d=8,steps=500 FILEs

              zstd --train-cover=k=50 FILEs

              zstd --train-cover=k=50,split=60 FILEs

              zstd --train-cover=shrink FILEs

              zstd --train-cover=shrink=2 FILEs

       --train-fastcover[=k=#,d=#,f=#,steps=#,split=#,accel=#]
              Same as cover but with extra parameters f and accel and a
              different default value of split. If split is not specified,
              then it tries split = 75. If f is not specified, then it tries
              f = 20. Requires that 0 < f < 32. If accel is not specified,
              then it tries accel = 1. Requires that 0 < accel <= 10. Requires
              that d = 6 or d = 8.

              f is log of size of array that keeps track of frequency of
              subsegments of size d. The subsegment is hashed to an index in
              the range [0, 2^f - 1]. It is possible that 2 different
              subsegments are hashed to the same index, and they are
              considered as the same subsegment when computing frequency.
              Using a higher f reduces collisions but takes longer.

              Examples:

              zstd --train-fastcover FILEs

              zstd --train-fastcover=d=8,f=15,accel=2 FILEs

       --train-legacy[=selectivity=#]
              Use legacy dictionary builder algorithm with the given
              dictionary selectivity (default: 9). The smaller the selectivity
              value, the denser the dictionary, improving its efficiency but
              reducing its possible maximum size. --train-legacy=s=# is also
              accepted.

              Examples:

              zstd --train-legacy FILEs

              zstd --train-legacy=selectivity=8 FILEs

BENCHMARK

       -b#    benchmark file(s) using compression level #

       -e#    benchmark file(s) using multiple compression levels, from -b# to
              -e# (inclusive)

       -i#    minimum evaluation time, in seconds (default: 3s), benchmark
              mode only

       -B#, --block-size=#
              cut file(s) into independent blocks of size # (default: no
              block)

       --priority=rt
              set process priority to real-time

       Output Format: CompressionLevel#Filename : InputSize -> OutputSize
       (CompressionRatio), CompressionSpeed, DecompressionSpeed

       Methodology: For both compression and decompression speed, the entire
       input is compressed/decompressed in-memory to measure speed. A run
       lasts at least 1 sec, so when files are small, they are
       compressed/decompressed several times per run, in order to improve
       measurement accuracy.

ADVANCED COMPRESSION OPTIONS

   --zstd[=options]:
       zstd provides 22 predefined compression levels. The selected or default
       predefined compression level can be changed with advanced compression
       options. The options are provided as a comma-separated list. You may
       specify only the options you want to change and the rest will be taken
       from the selected or default compression level. The list of available
       options:

       strategy=strat, strat=strat
              Specify a strategy used by a match finder.

              There are 9 strategies numbered from 1 to 9, from faster to
              stronger: 1=ZSTD_fast, 2=ZSTD_dfast, 3=ZSTD_greedy, 4=ZSTD_lazy,
              5=ZSTD_lazy2, 6=ZSTD_btlazy2, 7=ZSTD_btopt, 8=ZSTD_btultra,
              9=ZSTD_btultra2.

       windowLog=wlog, wlog=wlog
              Specify the maximum number of bits for a match distance.

              A higher number of bits increases the chance to find a match,
              which usually improves compression ratio. It also increases
              memory requirements for the compressor and decompressor. The
              minimum wlog is 10 (1 KiB) and the maximum is 30 (1 GiB) on
              32-bit platforms and 31 (2 GiB) on 64-bit platforms.

              Note: If windowLog is set to larger than 27, --long=windowLog or
              --memory=windowSize needs to be passed to the decompressor.

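       The relation between wlog and the window size in bytes (2^wlog), and
       the point at which the decompressor needs an explicit flag, can be
       sketched as follows. The helper names window_size and
       decompressor_flag are hypothetical.

```shell
# Window size in bytes for a given windowLog: 2^wlog.
window_size() {
    echo $((1 << $1))
}

# Flag the decompressor needs according to the note above: only when
# wlog is larger than 27. Prints nothing otherwise.
decompressor_flag() {
    wlog=$1
    if [ "$wlog" -gt 27 ]; then
        echo "--long=$wlog"
    fi
}

window_size 10         # prints 1024        (1 KiB)
window_size 27         # prints 134217728   (128 MiB)
window_size 30         # prints 1073741824  (1 GiB)
decompressor_flag 30   # prints --long=30
```
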
       hashLog=hlog, hlog=hlog
              Specify the maximum number of bits for a hash table.

              Bigger hash tables cause fewer collisions, which usually makes
              compression faster, but require more memory during compression.

              The minimum hlog is 6 (64 B) and the maximum is 26 (128 MiB).

       chainLog=clog, clog=clog
              Specify the maximum number of bits for a hash chain or a binary
              tree.

              A higher number of bits increases the chance to find a match,
              which usually improves compression ratio. It also slows down
              compression speed and increases memory requirements for
              compression. This option is ignored for the ZSTD_fast strategy.

              The minimum clog is 6 (64 B) and the maximum is 28 (256 MiB).

       searchLog=slog, slog=slog
              Specify the maximum number of searches in a hash chain or a
              binary tree using logarithmic scale.

              More searches increase the chance to find a match, which usually
              increases compression ratio but decreases compression speed.

              The minimum slog is 1 and the maximum is 26.

       minMatch=mml, mml=mml
              Specify the minimum searched length of a match in a hash table.

              Larger search lengths usually decrease compression ratio but
              improve decompression speed.

              The minimum mml is 3 and the maximum is 7.

       targetLen=tlen, tlen=tlen
              The impact of this field varies depending on the selected
              strategy.

              For ZSTD_btopt, ZSTD_btultra and ZSTD_btultra2, it specifies the
              minimum match length that causes match finder to stop searching.
              A larger targetLen usually improves compression ratio but
              decreases compression speed.

              For ZSTD_fast, it triggers ultra-fast mode when > 0. The value
              represents the amount of data skipped between match sampling.
              Impact is reversed: a larger targetLen increases compression
              speed but decreases compression ratio.

              For all other strategies, this field has no impact.

              The minimum tlen is 0 and the maximum is 999.

       overlapLog=ovlog, ovlog=ovlog
              Determine overlapSize, the amount of data reloaded from the
              previous job. This parameter is only available when
              multithreading is enabled. Reloading more data improves
              compression ratio, but decreases speed.

              The minimum ovlog is 0, and the maximum is 9. 1 means "no
              overlap", hence completely independent jobs. 9 means "full
              overlap", meaning up to windowSize is reloaded from previous
              job. Reducing ovlog by 1 reduces the reloaded amount by a factor
              of 2. For example, 8 means "windowSize/2", and 6 means
              "windowSize/8". Value 0 is special and means "default": ovlog is
              automatically determined by zstd. In which case, ovlog will
              range from 6 to 9, depending on selected strat.

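       The halving rule for overlapLog above can be written out as a short
       shell function. This is a sketch of the arithmetic as described on
       this page, not zstd's internal code; overlap_size is a hypothetical
       name, and "auto" is just a placeholder this sketch prints for the
       special value 0.

```shell
# usage: overlap_size WINDOW_SIZE_BYTES OVLOG
overlap_size() {
    wsize=$1
    ovlog=$2
    case "$ovlog" in
        0) echo auto ;;   # special value: zstd picks ovlog itself
        1) echo 0 ;;      # "no overlap": fully independent jobs
        [2-9])
            # 9 = full window; each step down halves the reloaded amount.
            echo $((wsize >> (9 - ovlog)))
            ;;
        *) echo "ovlog out of range: $ovlog" >&2; return 1 ;;
    esac
}

overlap_size 8388608 9   # prints 8388608 (windowSize)
overlap_size 8388608 8   # prints 4194304 (windowSize/2)
overlap_size 8388608 6   # prints 1048576 (windowSize/8)
```
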
       ldmHashLog=lhlog, lhlog=lhlog
              Specify the maximum size for a hash table used for long distance
              matching.

              This option is ignored unless long distance matching is enabled.

              Bigger hash tables usually improve compression ratio at the
              expense of more memory during compression and a decrease in
              compression speed.

              The minimum lhlog is 6 and the maximum is 26 (default: 20).

       ldmMinMatch=lmml, lmml=lmml
              Specify the minimum searched length of a match for long distance
              matching.

              This option is ignored unless long distance matching is enabled.

              Larger/very small values usually decrease compression ratio.

              The minimum lmml is 4 and the maximum is 4096 (default: 64).

       ldmBucketSizeLog=lblog, lblog=lblog
              Specify the size of each bucket for the hash table used for long
              distance matching.

              This option is ignored unless long distance matching is enabled.

              Larger bucket sizes improve collision resolution but decrease
              compression speed.

              The minimum lblog is 0 and the maximum is 8 (default: 3).

       ldmHashRateLog=lhrlog, lhrlog=lhrlog
              Specify the frequency of inserting entries into the long
              distance matching hash table.

              This option is ignored unless long distance matching is enabled.

              Larger values will improve compression speed. Deviating far from
              the default value will likely result in a decrease in
              compression ratio.

              The default value is wlog - lhlog.

   Example
       The following parameters set advanced compression options to something
       similar to predefined level 19 for files bigger than 256 KB:

       --zstd=wlog=23,clog=23,hlog=22,slog=6,mml=3,tlen=48,strat=6

   -B#:
       Select the size of each compression job. This parameter is available
       only when multi-threading is enabled. Default value is 4 * windowSize,
       which means it varies depending on compression level. -B# makes it
       possible to select a custom value. Note that job size must respect a
       minimum value which is enforced transparently. This minimum is either
       1 MB, or overlapSize, whichever is larger.

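       The job size rule above can be sketched in shell. The function name
       job_size is hypothetical, "1 MB" is taken to mean 1,048,576 bytes, and
       applying the minimum to the default value as well as to a custom -B#
       value is an assumption of this sketch.

```shell
# usage: job_size WINDOW_SIZE_BYTES OVERLAP_SIZE_BYTES [REQUESTED_BYTES]
job_size() {
    wsize=$1
    overlap=$2
    requested=${3:-}
    # Minimum is 1 MB (assumed 1,048,576 bytes) or overlapSize,
    # whichever is larger.
    min=1048576
    if [ "$overlap" -gt "$min" ]; then
        min=$overlap
    fi
    # Default job size is 4 * windowSize unless a custom value is given.
    if [ -n "$requested" ]; then
        size=$requested
    else
        size=$((4 * wsize))
    fi
    # Enforce the minimum transparently.
    if [ "$size" -lt "$min" ]; then
        size=$min
    fi
    echo "$size"
}

job_size 8388608 4194304         # prints 33554432 (4 * windowSize)
job_size 8388608 4194304 65536   # prints 4194304  (raised to overlapSize)
```
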

BUGS

       Report bugs at: https://github.com/facebook/zstd/issues

AUTHOR

       Yann Collet

zstd 1.4.4                       October 2019                          ZSTD(1)