ZSTD(1)                        User Commands                        ZSTD(1)



NAME
       zstd - zstd, zstdmt, unzstd, zstdcat - Compress or decompress .zst
       files

SYNOPSIS
       zstd [OPTIONS] [-|INPUT-FILE] [-o OUTPUT-FILE]

       zstdmt is equivalent to zstd -T0

       unzstd is equivalent to zstd -d

       zstdcat is equivalent to zstd -dcf

DESCRIPTION
       zstd is a fast lossless compression algorithm and data compression
       tool, with command line syntax similar to gzip(1) and xz(1). It is
       based on the LZ77 family, with further FSE & huff0 entropy stages.
       zstd offers highly configurable compression speed, with fast modes
       at > 200 MB/s per core, and strong modes nearing lzma compression
       ratios. It also features a very fast decoder, with speeds > 500
       MB/s per core.

       zstd command line syntax is generally similar to gzip, but features
       the following differences:

       · Source files are preserved by default. It's possible to remove
         them automatically by using the --rm command line option.

       · When compressing a single file, zstd displays progress
         notifications and a result summary by default. Use -q to turn
         them off.

       · zstd does not accept input from the console, but it properly
         accepts stdin when it is not the console.

       · zstd displays a short help page when the command line is invalid.
         Use -q to turn it off.
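       A minimal round trip illustrating the defaults above (the source
       file is kept, and the .zst suffix is appended) might look like the
       following sketch; file names are placeholders:

```shell
# Compress a sample file, check it, then decompress to a copy and verify.
printf 'hello zstd\n' > sample.txt
zstd -qf sample.txt                       # creates sample.txt.zst, keeps sample.txt
zstd -tq sample.txt.zst                   # integrity check only, no files written
zstd -dqf -o sample.copy sample.txt.zst   # decompress to an explicit name
cmp sample.txt sample.copy                # round trip is byte-identical
rm sample.txt sample.txt.zst sample.copy
```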

       zstd compresses or decompresses each file according to the selected
       operation mode. If no files are given or file is -, zstd reads from
       standard input and writes the processed data to standard output.
       zstd will refuse to write compressed data to standard output if it
       is a terminal: it will display an error message and skip the file.
       Similarly, zstd will refuse to read compressed data from standard
       input if it is a terminal.

       Unless --stdout or -o is specified, files are written to a new file
       whose name is derived from the source file name:

       · When compressing, the suffix .zst is appended to the source
         filename to get the target filename.

       · When decompressing, the .zst suffix is removed from the source
         filename to get the target filename.

   Concatenation with .zst files
       It is possible to concatenate .zst files as is. zstd will
       decompress such files as if they were a single .zst file.

   Integer suffixes and special values
       In most places where an integer argument is expected, an optional
       suffix is supported to easily indicate large integers. There must
       be no space between the integer and the suffix.

       KiB    Multiply the integer by 1,024 (2^10). Ki, K, and KB are
              accepted as synonyms for KiB.

       MiB    Multiply the integer by 1,048,576 (2^20). Mi, M, and MB are
              accepted as synonyms for MiB.
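       For instance, the suffixed and plain spellings below select the
       same job size, so the compressed outputs are identical (a sketch;
       it assumes a zstd build with multithread support, since -B only
       applies there):

```shell
# 1MiB and 1048576 are the same value, so both runs produce the same bytes.
seq -f "suffix demo line %g" 1 200000 > in.txt
zstd -qf -T2 -B1MiB    -o a.zst in.txt
zstd -qf -T2 -B1048576 -o b.zst in.txt
cmp a.zst b.zst      # identical compressed output
rm in.txt a.zst b.zst
```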

   Operation mode
       If multiple operation mode options are given, the last one takes
       effect.

       -z, --compress
              Compress. This is the default operation mode when no
              operation mode option is specified and no other operation
              mode is implied from the command name (for example, unzstd
              implies --decompress).

       -d, --decompress, --uncompress
              Decompress.

       -t, --test
              Test the integrity of compressed files. This option is
              equivalent to --decompress --stdout except that the
              decompressed data is discarded instead of being written to
              standard output. No files are created or removed.

       -b#    Benchmark file(s) using compression level #.

       --train FILEs
              Use FILEs as a training set to create a dictionary. The
              training set should contain a lot of small files (> 100).

       -l, --list
              Display information related to a zstd compressed file, such
              as size, ratio, and checksum. Some of these fields may not
              be available. This command can be augmented with the -v
              modifier.

   Operation modifiers
       -#     # compression level [1-19] (default: 3)

       --fast[=#]
              switch to ultra-fast compression levels. If =# is not
              present, it defaults to 1. The higher the value, the faster
              the compression speed, at the cost of some compression
              ratio. This setting overwrites the compression level if one
              was set previously. Similarly, if a compression level is set
              after --fast, it overrides it.

       --ultra
              unlocks high compression levels 20+ (maximum 22), using a
              lot more memory. Note that decompression will also require
              more memory when using these levels.

       --long[=#]
              enables long distance matching with # windowLog. If # is not
              present, it defaults to 27. This increases the window size
              (windowLog) and memory usage for both the compressor and
              decompressor. This setting is designed to improve the
              compression ratio for files with long matches at a large
              distance.

              Note: If windowLog is set to larger than 27,
              --long=windowLog or --memory=windowSize needs to be passed
              to the decompressor.
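       As a sketch of that note (file names are placeholders): when
       compressing with a window log above 27, pass --long (or --memory)
       to the decompressor as well so it accepts the larger window:

```shell
seq -f "long distance matching demo line %g" 1 100000 > big.txt
zstd -qf --long=30 big.txt                 # compress with windowLog = 30
# Repeat --long=30 (or use --memory=1GB) so the decoder accepts the window:
zstd -dqf --long=30 -o big.copy big.txt.zst
cmp big.txt big.copy
rm big.txt big.txt.zst big.copy
```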

       -T#, --threads=#
              Compress using # working threads (default: 1). If # is 0,
              attempt to detect and use the number of physical CPU cores.
              In all cases, the number of threads is capped to
              ZSTDMT_NBTHREADS_MAX==200. This modifier does nothing if
              zstd is compiled without multithread support.
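       For example, letting zstd pick the thread count itself is
       equivalent to invoking zstdmt (a sketch; on a build without
       multithread support -T0 is simply ignored):

```shell
# -T0 detects the number of physical cores and uses that many workers.
seq -f "threading demo line %g" 1 100000 > big.txt
zstd -qf -T0 big.txt          # same as: zstdmt -qf big.txt
zstd -tq big.txt.zst          # verify the result
rm big.txt big.txt.zst
```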

       --single-thread
              Does not spawn a thread for compression; uses a single
              thread for both I/O and compression. In this mode,
              compression is serialized with I/O, which is slightly
              slower. (This is different from -T1, which spawns 1
              compression thread in parallel with I/O). This mode is the
              only one available when multithread support is disabled.
              Single-thread mode features lower memory usage. The final
              compressed result is slightly different from -T1.

       --adapt[=min=#,max=#]
              zstd will dynamically adapt the compression level to
              perceived I/O conditions. Compression level adaptation can
              be observed live by using the -v command line option.
              Adaptation can be constrained between supplied min and max
              levels. The feature works when combined with multi-threading
              and --long mode. It does not work with --single-thread. It
              sets the window size to 8 MB by default (can be changed
              manually, see wlog). Due to the chaotic nature of dynamic
              adaptation, the compressed result is not reproducible. Note:
              at the time of this writing, --adapt can remain stuck at low
              speed when combined with multiple worker threads (>=2).

       --rsyncable
              zstd will periodically synchronize the compression state to
              make the compressed file more rsync-friendly. There is a
              negligible impact to compression ratio, and the faster
              compression levels will see a small compression speed hit.
              This feature does not work with --single-thread. You
              probably don't want to use it with long range mode, since it
              will decrease the effectiveness of the synchronization
              points, but your mileage may vary.

       -D file
              use file as a dictionary to compress or decompress FILE(s)

       --no-dictID
              do not store the dictionary ID within the frame header
              (dictionary compression). The decoder will have to rely on
              implicit knowledge about which dictionary to use, and it
              won't be able to check if it is the correct one.

       -o file
              save result into file (only possible with a single
              INPUT-FILE)

       -f, --force
              overwrite output without prompting, and (de)compress
              symbolic links

       -c, --stdout
              force write to standard output, even if it is the console

       --[no-]sparse
              enable / disable sparse FS support, to make files with many
              zeroes smaller on disk. Creating sparse files may save disk
              space and speed up decompression by reducing the amount of
              disk I/O. Default: enabled when output is into a file, and
              disabled when output is stdout. This setting overrides the
              default and can force sparse mode over stdout.

       --rm   remove source file(s) after successful compression or
              decompression

       -k, --keep
              keep source file(s) after successful compression or
              decompression. This is the default behavior.

       -r     operate recursively on directories

       --format=FORMAT
              compress and decompress in other formats. If compiled with
              support, zstd can compress to or decompress from other
              compression algorithm formats. Possibly available options
              are zstd, gzip, xz, lzma, and lz4. If no such format is
              provided, zstd is the default.
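       For instance, a build compiled with zlib support can write gzip
       output directly (a sketch; whether --format=gzip is available
       depends on how the binary was built):

```shell
printf 'format demo\n' > demo.txt
zstd -qf --format=gzip demo.txt    # writes demo.txt.gz instead of demo.txt.zst
gzip -t demo.txt.gz                # the result is a valid gzip file
rm demo.txt demo.txt.gz
```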

       -h/-H, --help
              display help/long help and exit

       -V, --version
              display version number and exit. Advanced: -vV also displays
              supported formats. -vvV also displays POSIX support.

       -v     verbose mode

       -q, --quiet
              suppress warnings, interactivity, and notifications. Specify
              twice to suppress errors too.

       --no-progress
              do not display the progress bar, but keep all other
              messages.

       -C, --[no-]check
              add integrity check computed from uncompressed data
              (default: enabled)

       --     All arguments after -- are treated as files.

   Additional options for the pzstd utility
       -p, --processes
              number of threads to use for (de)compression (default: 4)

DICTIONARY BUILDER
       zstd offers dictionary compression, which greatly improves
       efficiency on small files and messages. It's possible to train zstd
       with a set of samples, the result of which is saved into a file
       called a dictionary. Then, during compression and decompression,
       reference the same dictionary using the -D dictionaryFileName
       command line option. Compression of small files similar to the
       sample set will be greatly improved.
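       The workflow above can be sketched as follows (hypothetical sample
       data and file names; training needs many small, similar files to
       succeed):

```shell
# Generate a training set of small, structurally similar files.
mkdir -p samples
for i in $(seq 1 300); do
    seq -f "record-%g status=ok payload=xxxxxxxx" "$i" $((i + 40)) > "samples/s$i.txt"
done
zstd -q --train samples/* -o demo.dict          # train a dictionary
zstd -qf  -D demo.dict samples/s1.txt -o s1.zst # compress with it
zstd -dqf -D demo.dict s1.zst -o s1.copy        # decompress with the same dict
cmp samples/s1.txt s1.copy
rm -r samples demo.dict s1.zst s1.copy
```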

       --train FILEs
              Use FILEs as a training set to create a dictionary. The
              training set should contain a lot of small files (> 100),
              and weigh typically 100x the target dictionary size (for
              example, 10 MB for a 100 KB dictionary).

              Supports multithreading if zstd is compiled with threading
              support. Additional parameters can be specified with
              --train-fastcover. The legacy dictionary builder can be
              accessed with --train-legacy. The cover dictionary builder
              can be accessed with --train-cover. --train is equivalent to
              --train-fastcover=d=8,steps=4.

       -o file
              Dictionary saved into file (default name: dictionary).

       --maxdict=#
              Limit dictionary to specified size (default: 112640).

       -#     Use # compression level during training (optional). Will
              generate statistics more tuned for the selected compression
              level, resulting in a small compression ratio improvement
              for this level.

       -B#    Split input files into blocks of size # (default: no split)

       --dictID=#
              A dictionary ID is a locally unique ID that a decoder can
              use to verify it is using the right dictionary. By default,
              zstd will create a 4-byte random number ID. It's possible to
              give a precise number instead. Short numbers have an
              advantage: an ID < 256 will only need 1 byte in the
              compressed frame header, and an ID < 65536 will only need 2
              bytes. This compares favorably to the 4-byte default.
              However, it's up to the dictionary manager not to assign the
              same ID to two different dictionaries.

       --train-cover[=k=#,d=#,steps=#,split=#,shrink[=#]]
              Select parameters for the default dictionary builder
              algorithm named cover. If d is not specified, then it tries
              d = 6 and d = 8. If k is not specified, then it tries steps
              values in the range [50, 2000]. If steps is not specified,
              then the default value of 40 is used. If split is not
              specified or split <= 0, then the default value of 100 is
              used. Requires that d <= k. If the shrink flag is not used,
              then the default value for shrinkDict of 0 is used. If
              shrink is not specified, then the default value for
              shrinkDictMaxRegression of 1 is used.

              Selects segments of size k with the highest score to put in
              the dictionary. The score of a segment is computed by the
              sum of the frequencies of all the subsegments of size d.
              Generally d should be in the range [6, 8], occasionally up
              to 16, but the algorithm will run faster with d <= 8. Good
              values for k vary widely based on the input data, but a safe
              range is [2 * d, 2000]. If split is 100, all input samples
              are used for both training and testing to find optimal d and
              k to build the dictionary. Supports multithreading if zstd
              is compiled with threading support. Having shrink enabled
              takes a truncated dictionary of minimum size and doubles in
              size until the compression ratio of the truncated dictionary
              is at most shrinkDictMaxRegression% worse than the
              compression ratio of the largest dictionary.

              Examples:

              zstd --train-cover FILEs

              zstd --train-cover=k=50,d=8 FILEs

              zstd --train-cover=d=8,steps=500 FILEs

              zstd --train-cover=k=50 FILEs

              zstd --train-cover=k=50,split=60 FILEs

              zstd --train-cover=shrink FILEs

              zstd --train-cover=shrink=2 FILEs

       --train-fastcover[=k=#,d=#,f=#,steps=#,split=#,accel=#]
              Same as cover but with extra parameters f and accel, and a
              different default value of split. If split is not specified,
              then it tries split = 75. If f is not specified, then it
              tries f = 20. Requires that 0 < f < 32. If accel is not
              specified, then it tries accel = 1. Requires that 0 < accel
              <= 10. Requires that d = 6 or d = 8.

              f is the log of the size of the array that keeps track of
              the frequency of subsegments of size d. The subsegment is
              hashed to an index in the range [0, 2^f - 1]. It is possible
              that 2 different subsegments are hashed to the same index,
              and they are considered as the same subsegment when
              computing frequency. Using a higher f reduces collisions but
              takes longer.

              Examples:

              zstd --train-fastcover FILEs

              zstd --train-fastcover=d=8,f=15,accel=2 FILEs

       --train-legacy[=selectivity=#]
              Use the legacy dictionary builder algorithm with the given
              dictionary selectivity (default: 9). The smaller the
              selectivity value, the denser the dictionary, improving its
              efficiency but reducing its possible maximum size.
              --train-legacy=s=# is also accepted.

              Examples:

              zstd --train-legacy FILEs

              zstd --train-legacy=selectivity=8 FILEs

BENCHMARK
       -b#    benchmark file(s) using compression level #

       -e#    benchmark file(s) using multiple compression levels, from
              -b# to -e# (inclusive)

       -i#    minimum evaluation time, in seconds (default: 3s), benchmark
              mode only

       -B#, --block-size=#
              cut file(s) into independent blocks of size # (default: no
              block)

       --priority=rt
              set process priority to real-time

       Output Format: CompressionLevel#Filename : InputSize -> OutputSize
       (CompressionRatio), CompressionSpeed, DecompressionSpeed

       Methodology: For both compression and decompression speed, the
       entire input is compressed/decompressed in-memory to measure speed.
       A run lasts at least 1 sec, so when files are small, they are
       compressed/decompressed several times per run, in order to improve
       measurement accuracy.
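       For example, sweeping levels 1 through 3 with a one-second
       evaluation window per level (the file name is a placeholder):

```shell
seq -f "benchmark sample line %g" 1 50000 > bench.txt
zstd -b1 -e3 -i1 bench.txt    # prints a result line for each level
rm bench.txt
```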

ADVANCED COMPRESSION OPTIONS
   --zstd[=options]:
       zstd provides 22 predefined compression levels. The selected or
       default predefined compression level can be changed with advanced
       compression options. The options are provided as a comma-separated
       list. You may specify only the options you want to change and the
       rest will be taken from the selected or default compression level.
       The list of available options:

       strategy=strat, strat=strat
              Specify a strategy used by a match finder.

              There are 9 strategies numbered from 1 to 9, from faster to
              stronger: 1=ZSTD_fast, 2=ZSTD_dfast, 3=ZSTD_greedy,
              4=ZSTD_lazy, 5=ZSTD_lazy2, 6=ZSTD_btlazy2, 7=ZSTD_btopt,
              8=ZSTD_btultra, 9=ZSTD_btultra2.

       windowLog=wlog, wlog=wlog
              Specify the maximum number of bits for a match distance.

              A higher number of bits increases the chance to find a
              match, which usually improves compression ratio. It also
              increases memory requirements for the compressor and
              decompressor. The minimum wlog is 10 (1 KiB) and the maximum
              is 30 (1 GiB) on 32-bit platforms and 31 (2 GiB) on 64-bit
              platforms.

              Note: If windowLog is set to larger than 27,
              --long=windowLog or --memory=windowSize needs to be passed
              to the decompressor.

       hashLog=hlog, hlog=hlog
              Specify the maximum number of bits for a hash table.

              Bigger hash tables cause fewer collisions, which usually
              makes compression faster, but requires more memory during
              compression.

              The minimum hlog is 6 (64 B) and the maximum is 26 (128
              MiB).

       chainLog=clog, clog=clog
              Specify the maximum number of bits for a hash chain or a
              binary tree.

              A higher number of bits increases the chance to find a
              match, which usually improves compression ratio. It also
              slows down compression speed and increases memory
              requirements for compression. This option is ignored for the
              ZSTD_fast strategy.

              The minimum clog is 6 (64 B) and the maximum is 28 (256
              MiB).

       searchLog=slog, slog=slog
              Specify the maximum number of searches in a hash chain or a
              binary tree using a logarithmic scale.

              More searches increase the chance to find a match, which
              usually increases compression ratio but decreases
              compression speed.

              The minimum slog is 1 and the maximum is 26.

       minMatch=mml, mml=mml
              Specify the minimum searched length of a match in a hash
              table.

              Larger search lengths usually decrease compression ratio but
              improve decompression speed.

              The minimum mml is 3 and the maximum is 7.

       targetLen=tlen, tlen=tlen
              The impact of this field varies depending on the selected
              strategy.

              For ZSTD_btopt, ZSTD_btultra and ZSTD_btultra2, it specifies
              the minimum match length that causes the match finder to
              stop searching. A larger targetLen usually improves
              compression ratio but decreases compression speed.

              For ZSTD_fast, it triggers ultra-fast mode when > 0. The
              value represents the amount of data skipped between match
              sampling. The impact is reversed: a larger targetLen
              increases compression speed but decreases compression ratio.

              For all other strategies, this field has no impact.

              The minimum tlen is 0 and the maximum is 999.

       overlapLog=ovlog, ovlog=ovlog
              Determine overlapSize, the amount of data reloaded from the
              previous job. This parameter is only available when
              multithreading is enabled. Reloading more data improves
              compression ratio, but decreases speed.

              The minimum ovlog is 0, and the maximum is 9. 1 means "no
              overlap", hence completely independent jobs. 9 means "full
              overlap", meaning up to windowSize is reloaded from the
              previous job. Reducing ovlog by 1 reduces the reloaded
              amount by a factor 2. For example, 8 means "windowSize/2",
              and 6 means "windowSize/8". Value 0 is special and means
              "default": ovlog is automatically determined by zstd. In
              that case, ovlog will range from 6 to 9, depending on the
              selected strat.

       ldmHashLog=lhlog, lhlog=lhlog
              Specify the maximum size for a hash table used for long
              distance matching.

              This option is ignored unless long distance matching is
              enabled.

              Bigger hash tables usually improve compression ratio at the
              expense of more memory during compression and a decrease in
              compression speed.

              The minimum lhlog is 6 and the maximum is 26 (default: 20).

       ldmMinMatch=lmml, lmml=lmml
              Specify the minimum searched length of a match for long
              distance matching.

              This option is ignored unless long distance matching is
              enabled.

              Larger/very small values usually decrease compression ratio.

              The minimum lmml is 4 and the maximum is 4096 (default: 64).

       ldmBucketSizeLog=lblog, lblog=lblog
              Specify the size of each bucket for the hash table used for
              long distance matching.

              This option is ignored unless long distance matching is
              enabled.

              Larger bucket sizes improve collision resolution but
              decrease compression speed.

              The minimum lblog is 0 and the maximum is 8 (default: 3).

       ldmHashRateLog=lhrlog, lhrlog=lhrlog
              Specify the frequency of inserting entries into the long
              distance matching hash table.

              This option is ignored unless long distance matching is
              enabled.

              Larger values will improve compression speed. Deviating far
              from the default value will likely result in a decrease in
              compression ratio.

              The default value is wlog - lhlog.

   Example
       The following parameters set advanced compression options to
       something similar to predefined level 19 for files bigger than 256
       KB:

       --zstd=wlog=23,clog=23,hlog=22,slog=6,mml=3,tlen=48,strat=6

   -B#:
       Select the size of each compression job. This parameter is
       available only when multi-threading is enabled. The default value
       is 4 * windowSize, which means it varies depending on compression
       level. -B# makes it possible to select a custom value. Note that
       job size must respect a minimum value which is enforced
       transparently. This minimum is either 1 MB, or overlapSize,
       whichever is larger.

BUGS
       Report bugs at: https://github.com/facebook/zstd/issues

AUTHOR
       Yann Collet



zstd 1.4.2                       July 2019                          ZSTD(1)