ZSTD(1)                          User Commands                         ZSTD(1)

NAME
       zstd - zstd, zstdmt, unzstd, zstdcat - Compress or decompress .zst
       files

SYNOPSIS
       zstd [OPTIONS] [-|INPUT-FILE] [-o OUTPUT-FILE]

       zstdmt is equivalent to zstd -T0

       unzstd is equivalent to zstd -d

       zstdcat is equivalent to zstd -dcf

DESCRIPTION
       zstd is a fast lossless compression algorithm and data compression
       tool, with command line syntax similar to gzip(1) and xz(1). It is
       based on the LZ77 family, with further FSE & huff0 entropy stages.
       zstd offers highly configurable compression speed, with fast modes at
       > 200 MB/s per core, and strong modes nearing lzma compression
       ratios. It also features a very fast decoder, with speeds > 500 MB/s
       per core.

       zstd command line syntax is generally similar to gzip, but features
       the following differences:

       ·   Source files are preserved by default. It's possible to remove
           them automatically by using the --rm option.

       ·   When compressing a single file, zstd displays progress
           notifications and a result summary by default. Use -q to turn
           them off.

       ·   zstd does not accept input from the console, but it properly
           accepts stdin when it is not the console.

       ·   zstd displays a short help page when the command line is invalid.
           Use -q to turn it off.

       zstd compresses or decompresses each file according to the selected
       operation mode. If no files are given or file is -, zstd reads from
       standard input and writes the processed data to standard output. zstd
       will refuse to write compressed data to standard output if it is a
       terminal: it will display an error message and skip the file.
       Similarly, zstd will refuse to read compressed data from standard
       input if it is a terminal.

       Unless --stdout or -o is specified, files are written to a new file
       whose name is derived from the source file name:

       ·   When compressing, the suffix .zst is appended to the source
           filename to get the target filename.

       ·   When decompressing, the .zst suffix is removed from the source
           filename to get the target filename.

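A quick sketch of this naming behavior (the file and directory names below are illustrative):

```shell
# Round trip using the default .zst suffix rules.
mkdir -p /tmp/zstd-name-demo && cd /tmp/zstd-name-demo
printf 'hello zstd\n' > example.txt

zstd -q -f example.txt                      # writes example.txt.zst, keeps example.txt
zstd -d -q -f example.txt.zst -o check.txt  # -o overrides the derived target name
cmp example.txt check.txt                   # the round trip is lossless
```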

   Concatenation with .zst files
       It is possible to concatenate .zst files as is. zstd will decompress
       such files as if they were a single .zst file.
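A minimal sketch of this behavior (file names are illustrative):

```shell
# Two independently compressed frames, concatenated byte-for-byte.
mkdir -p /tmp/zstd-cat-demo && cd /tmp/zstd-cat-demo
printf 'part one\n' > a.txt
printf 'part two\n' > b.txt

zstd -q -f a.txt b.txt               # produces a.txt.zst and b.txt.zst
cat a.txt.zst b.txt.zst > both.zst   # plain concatenation, no re-compression
zstd -d -q -c both.zst               # decompresses as a single stream
```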

   Integer suffixes and special values
       In most places where an integer argument is expected, an optional
       suffix is supported to easily indicate large integers. There must be
       no space between the integer and the suffix.

       KiB    Multiply the integer by 1,024 (2^10). Ki, K, and KB are
              accepted as synonyms for KiB.

       MiB    Multiply the integer by 1,048,576 (2^20). Mi, M, and MB are
              accepted as synonyms for MiB.
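For instance, using the documented --memory option with a size suffix (a sketch; file names are illustrative):

```shell
# Size suffixes attach directly to the integer, with no space in between.
mkdir -p /tmp/zstd-suffix-demo && cd /tmp/zstd-suffix-demo
printf 'suffix demo\n' > s.txt
zstd -q -f s.txt

# 128MiB means 128 * 1048576 bytes; 128M, 128Mi and 128MB are synonyms.
zstd -d -q -f --memory=128MiB s.txt.zst -o s.out
cmp s.txt s.out
```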

   Operation mode
       If multiple operation mode options are given, the last one takes
       effect.

       -z, --compress
              Compress. This is the default operation mode when no operation
              mode option is specified and no other operation mode is
              implied from the command name (for example, unzstd implies
              --decompress).

       -d, --decompress, --uncompress
              Decompress.

       -t, --test
              Test the integrity of compressed files. This option is
              equivalent to --decompress --stdout except that the
              decompressed data is discarded instead of being written to
              standard output. No files are created or removed.

       -b#    Benchmark file(s) using compression level #.

       --train FILEs
              Use FILEs as a training set to create a dictionary. The
              training set should contain a lot of small files (> 100).

       -l, --list
              Display information related to a zstd compressed file, such as
              size, ratio, and checksum. Some of these fields may not be
              available. This command can be augmented with the -v modifier.

   Operation modifiers
       -#     # compression level [1-19] (default: 3)

       --fast[=#]
              Switch to ultra-fast compression levels. If =# is not present,
              it defaults to 1. The higher the value, the faster the
              compression speed, at the cost of some compression ratio. This
              setting overrides the compression level if one was set
              previously. Similarly, if a compression level is set after
              --fast, it overrides it.

       --ultra
              Unlock high compression levels 20+ (maximum 22), using a lot
              more memory. Note that decompression will also require more
              memory when using these levels.

       --long[=#]
              Enable long distance matching with # windowLog; if # is not
              present, it defaults to 27. This increases the window size
              (windowLog) and memory usage for both the compressor and
              decompressor. This setting is designed to improve the
              compression ratio for files with long matches at a large
              distance.

              Note: If windowLog is set to larger than 27, --long=windowLog
              or --memory=windowSize needs to be passed to the decompressor.
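A sketch of such a round trip with an enlarged window (the file name is illustrative; a real input would need to be large for --long to pay off):

```shell
# windowLog 30 (a 1 GiB window) must be announced on the decompression side too.
mkdir -p /tmp/zstd-long-demo && cd /tmp/zstd-long-demo
printf 'long distance matching demo\n' > big.txt

zstd -q -f --long=30 big.txt
zstd -d -q -f --long=30 big.txt.zst -o big.out   # or: --memory=1024MiB
cmp big.txt big.out
```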

       -T#, --threads=#
              Compress using # working threads (default: 1). If # is 0,
              attempt to detect and use the number of physical CPU cores. In
              all cases, the number of threads is capped to
              ZSTDMT_NBTHREADS_MAX==200. This modifier does nothing if zstd
              is compiled without multithread support.

       --single-thread
              Do not spawn a thread for compression; use a single thread for
              both I/O and compression. In this mode, compression is
              serialized with I/O, which is slightly slower. (This is
              different from -T1, which spawns 1 compression thread in
              parallel with I/O.) This mode is the only one available when
              multithread support is disabled. Single-thread mode features
              lower memory usage. The final compressed result is slightly
              different from -T1.

       --adapt[=min=#,max=#]
              zstd will dynamically adapt the compression level to perceived
              I/O conditions. Compression level adaptation can be observed
              live by using the -v option. Adaptation can be constrained
              between supplied min and max levels. The feature works when
              combined with multi-threading and --long mode. It does not
              work with --single-thread. It sets the window size to 8 MB by
              default (this can be changed manually, see wlog). Due to the
              chaotic nature of dynamic adaptation, the compressed result is
              not reproducible. Note: at the time of this writing, --adapt
              can remain stuck at low speed when combined with multiple
              worker threads (>=2).

       --rsyncable
              zstd will periodically synchronize the compression state to
              make the compressed file more rsync-friendly. There is a
              negligible impact to compression ratio, and the faster
              compression levels will see a small compression speed hit.
              This feature does not work with --single-thread. You probably
              don't want to use it with long range mode, since it will
              decrease the effectiveness of the synchronization points, but
              your mileage may vary.

       -D file
              Use file as a dictionary to compress or decompress FILE(s).

       --no-dictID
              Do not store the dictionary ID within the frame header
              (dictionary compression). The decoder will have to rely on
              implicit knowledge about which dictionary to use; it won't be
              able to check if it's the correct one.

       -o file
              Save the result into file (only possible with a single
              INPUT-FILE).

       -f, --force
              Overwrite output without prompting, and (de)compress symbolic
              links.

       -c, --stdout
              Force write to standard output, even if it is the console.

       --[no-]sparse
              Enable / disable sparse FS support, to make files with many
              zeroes smaller on disk. Creating sparse files may save disk
              space and speed up decompression by reducing the amount of
              disk I/O. Default: enabled when output is into a file, and
              disabled when output is stdout. This setting overrides the
              default and can force sparse mode over stdout.

       --rm   Remove source file(s) after successful compression or
              decompression.

       -k, --keep
              Keep source file(s) after successful compression or
              decompression. This is the default behavior.

       -r     Operate recursively on directories.

       --format=FORMAT
              Compress and decompress in other formats. If compiled with
              support, zstd can compress to or decompress from other
              compression algorithm formats. Possibly available options are
              zstd, gzip, xz, lzma, and lz4. If no such format is provided,
              zstd is the default.
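For example, assuming a build compiled with zlib support (the file name is illustrative):

```shell
# Produce a standard gzip file instead of a .zst file.
mkdir -p /tmp/zstd-fmt-demo && cd /tmp/zstd-fmt-demo
printf 'format demo\n' > f.txt

zstd -q -f --format=gzip f.txt   # writes f.txt.gz
gzip -t f.txt.gz                 # the result is readable by gzip itself
```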

       -h/-H, --help
              Display help / long help and exit.

       -V, --version
              Display the version number and exit. Advanced: -vV also
              displays supported formats. -vvV also displays POSIX support.

       -v     Verbose mode.

       -q, --quiet
              Suppress warnings, interactivity, and notifications. Specify
              twice to suppress errors too.

       -C, --[no-]check
              Add integrity check computed from uncompressed data (default:
              enabled).

       --     All arguments after -- are treated as files.

   Additional options for the pzstd utility
       -p, --processes
              Number of threads to use for (de)compression (default: 4).

DICTIONARY BUILDER
       zstd offers dictionary compression, which greatly improves efficiency
       on small files and messages. It's possible to train zstd with a set
       of samples, the result of which is saved into a file called a
       dictionary. Then, during compression and decompression, reference the
       same dictionary, using -D dictionaryFileName. Compression of small
       files similar to the sample set will be greatly improved.
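The whole workflow can be sketched as follows (all paths and names are illustrative, and a real training set should be larger and more varied):

```shell
# 1) Generate a toy training set of >100 small, similar files.
mkdir -p /tmp/zstd-dict-demo/samples && cd /tmp/zstd-dict-demo
for i in $(seq 1 150); do
    for j in $(seq 1 20); do
        printf 'field%s=value%s&session=common-header\n' "$j" "$i"
    done > "samples/s$i"
done

# 2) Train a dictionary from the samples.
zstd -q --train samples/* -o demo.dict

# 3) Compress and decompress small files with the same dictionary.
zstd -q -f -D demo.dict samples/s1
zstd -d -q -f -D demo.dict samples/s1.zst -o s1.out
cmp samples/s1 s1.out
```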

       --train FILEs
              Use FILEs as a training set to create a dictionary. The
              training set should contain a lot of small files (> 100), and
              typically weigh about 100x the target dictionary size (for
              example, 10 MB for a 100 KB dictionary).

              Supports multithreading if zstd is compiled with threading
              support. Additional parameters can be specified with
              --train-fastcover. The legacy dictionary builder can be
              accessed with --train-legacy. The cover dictionary builder can
              be accessed with --train-cover. --train is equivalent to
              --train-fastcover=d=8,steps=4.

       -o file
              Dictionary saved into file (default name: dictionary).

       --maxdict=#
              Limit dictionary to specified size (default: 112640).

       -#     Use # compression level during training (optional). Will
              generate statistics more tuned for the selected compression
              level, resulting in a small compression ratio improvement for
              this level.

       -B#    Split input files into blocks of size # (default: no split).

       --dictID=#
              A dictionary ID is a locally unique ID that a decoder can use
              to verify it is using the right dictionary. By default, zstd
              will create a 4-byte random number ID. It's possible to give a
              precise number instead. Short numbers have an advantage: an ID
              < 256 will only need 1 byte in the compressed frame header,
              and an ID < 65536 will only need 2 bytes. This compares
              favorably to the 4-byte default. However, it's up to the
              dictionary manager not to assign the same ID twice to 2
              different dictionaries.

       --train-cover[=k=#,d=#,steps=#,split=#]
              Select parameters for the default dictionary builder algorithm
              named cover. If d is not specified, then it tries d = 6 and
              d = 8. If k is not specified, then it tries steps values in
              the range [50, 2000]. If steps is not specified, then the
              default value of 40 is used. If split is not specified or
              split <= 0, then the default value of 100 is used. Requires
              that d <= k.

              Selects segments of size k with the highest score to put in
              the dictionary. The score of a segment is computed by the sum
              of the frequencies of all the subsegments of size d. Generally
              d should be in the range [6, 8], occasionally up to 16, but
              the algorithm will run faster with d <= 8. Good values for k
              vary widely based on the input data, but a safe range is
              [2 * d, 2000]. If split is 100, all input samples are used for
              both training and testing to find the optimal d and k to build
              the dictionary. Supports multithreading if zstd is compiled
              with threading support.

              Examples:

              zstd --train-cover FILEs

              zstd --train-cover=k=50,d=8 FILEs

              zstd --train-cover=d=8,steps=500 FILEs

              zstd --train-cover=k=50 FILEs

              zstd --train-cover=k=50,split=60 FILEs

       --train-fastcover[=k=#,d=#,f=#,steps=#,split=#,accel=#]
              Same as cover but with extra parameters f and accel, and a
              different default value of split. If split is not specified,
              then it tries split = 75. If f is not specified, then it tries
              f = 20. Requires that 0 < f < 32. If accel is not specified,
              then it tries accel = 1. Requires that 0 < accel <= 10.
              Requires that d = 6 or d = 8.

              f is the log of the size of the array that keeps track of the
              frequency of subsegments of size d. Each subsegment is hashed
              to an index in the range [0, 2^f - 1]. It is possible that 2
              different subsegments are hashed to the same index, in which
              case they are considered the same subsegment when computing
              frequency. Using a higher f reduces collisions but takes
              longer.

              Examples:

              zstd --train-fastcover FILEs

              zstd --train-fastcover=d=8,f=15,accel=2 FILEs

       --train-legacy[=selectivity=#]
              Use the legacy dictionary builder algorithm with the given
              dictionary selectivity (default: 9). The smaller the
              selectivity value, the denser the dictionary, improving its
              efficiency but reducing its possible maximum size.
              --train-legacy=s=# is also accepted.

              Examples:

              zstd --train-legacy FILEs

              zstd --train-legacy=selectivity=8 FILEs

BENCHMARK
       -b#    Benchmark file(s) using compression level #.

       -e#    Benchmark file(s) using multiple compression levels, from -b#
              to -e# (inclusive).

       -i#    Minimum evaluation time, in seconds (default: 3s), benchmark
              mode only.

       -B#, --block-size=#
              Cut file(s) into independent blocks of size # (default: no
              block).

       --priority=rt
              Set process priority to real-time.

       Output Format: CompressionLevel#Filename: InputSize -> OutputSize
       (CompressionRatio), CompressionSpeed, DecompressionSpeed

       Methodology: For both compression and decompression speed, the
       entire input is compressed/decompressed in-memory to measure speed.
       A run lasts at least 1 sec, so when files are small, they are
       compressed/decompressed several times per run, in order to improve
       measurement accuracy.
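A sketch of a typical invocation (the file name is illustrative; -i0 merely shortens each run):

```shell
# Benchmark levels 1 through 3 on a single file, one result line per level.
mkdir -p /tmp/zstd-bench-demo && cd /tmp/zstd-bench-demo
seq 1 20000 > bench.txt

zstd -b1 -e3 -i0 bench.txt
```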

ADVANCED COMPRESSION OPTIONS
   --zstd[=options]:
       zstd provides 22 predefined compression levels. The selected or
       default predefined compression level can be changed with advanced
       compression options. The options are provided as a comma-separated
       list. You may specify only the options you want to change and the
       rest will be taken from the selected or default compression level.
       The list of available options:

       strategy=strat, strat=strat
              Specify the strategy used by the match finder.

              There are 9 strategies numbered from 1 to 9, from faster to
              stronger: 1=ZSTD_fast, 2=ZSTD_dfast, 3=ZSTD_greedy,
              4=ZSTD_lazy, 5=ZSTD_lazy2, 6=ZSTD_btlazy2, 7=ZSTD_btopt,
              8=ZSTD_btultra, 9=ZSTD_btultra2.

       windowLog=wlog, wlog=wlog
              Specify the maximum number of bits for a match distance.

              A higher number of bits increases the chance to find a match,
              which usually improves compression ratio. It also increases
              memory requirements for the compressor and decompressor. The
              minimum wlog is 10 (1 KiB) and the maximum is 30 (1 GiB) on
              32-bit platforms and 31 (2 GiB) on 64-bit platforms.

              Note: If windowLog is set to larger than 27, --long=windowLog
              or --memory=windowSize needs to be passed to the decompressor.

       hashLog=hlog, hlog=hlog
              Specify the maximum number of bits for a hash table.

              Bigger hash tables cause fewer collisions, which usually makes
              compression faster, but requires more memory during
              compression.

              The minimum hlog is 6 (64 B) and the maximum is 26 (128 MiB).

       chainLog=clog, clog=clog
              Specify the maximum number of bits for a hash chain or a
              binary tree.

              A higher number of bits increases the chance to find a match,
              which usually improves compression ratio. It also slows down
              compression speed and increases memory requirements for
              compression. This option is ignored for the ZSTD_fast
              strategy.

              The minimum clog is 6 (64 B) and the maximum is 28 (256 MiB).

       searchLog=slog, slog=slog
              Specify the maximum number of searches in a hash chain or a
              binary tree using logarithmic scale.

              More searches increase the chance to find a match, which
              usually increases compression ratio but decreases compression
              speed.

              The minimum slog is 1 and the maximum is 26.

       minMatch=mml, mml=mml
              Specify the minimum searched length of a match in a hash
              table.

              Larger search lengths usually decrease compression ratio but
              improve decompression speed.

              The minimum mml is 3 and the maximum is 7.

       targetLen=tlen, tlen=tlen
              The impact of this field varies depending on the selected
              strategy.

              For ZSTD_btopt, ZSTD_btultra and ZSTD_btultra2, it specifies
              the minimum match length that causes the match finder to stop
              searching. A larger targetLen usually improves compression
              ratio but decreases compression speed.

              For ZSTD_fast, it triggers ultra-fast mode when > 0. The value
              represents the amount of data skipped between match samplings.
              Impact is reversed: a larger targetLen increases compression
              speed but decreases compression ratio.

              For all other strategies, this field has no impact.

              The minimum tlen is 0 and the maximum is 999.

       overlapLog=ovlog, ovlog=ovlog
              Determine overlapSize, the amount of data reloaded from the
              previous job. This parameter is only available when
              multithreading is enabled. Reloading more data improves
              compression ratio, but decreases speed.

              The minimum ovlog is 0, and the maximum is 9. 1 means "no
              overlap", hence completely independent jobs. 9 means "full
              overlap", meaning up to windowSize is reloaded from the
              previous job. Reducing ovlog by 1 reduces the reloaded amount
              by a factor of 2. For example, 8 means "windowSize/2", and 6
              means "windowSize/8". Value 0 is special and means "default":
              ovlog is automatically determined by zstd. In that case, ovlog
              will range from 6 to 9, depending on the selected strat.

       ldmHashLog=lhlog, lhlog=lhlog
              Specify the maximum size for a hash table used for long
              distance matching.

              This option is ignored unless long distance matching is
              enabled.

              Bigger hash tables usually improve compression ratio at the
              expense of more memory during compression and a decrease in
              compression speed.

              The minimum lhlog is 6 and the maximum is 26 (default: 20).

       ldmMinMatch=lmml, lmml=lmml
              Specify the minimum searched length of a match for long
              distance matching.

              This option is ignored unless long distance matching is
              enabled.

              Larger or very small values usually decrease compression
              ratio.

              The minimum lmml is 4 and the maximum is 4096 (default: 64).

       ldmBucketSizeLog=lblog, lblog=lblog
              Specify the size of each bucket for the hash table used for
              long distance matching.

              This option is ignored unless long distance matching is
              enabled.

              Larger bucket sizes improve collision resolution but decrease
              compression speed.

              The minimum lblog is 0 and the maximum is 8 (default: 3).

       ldmHashRateLog=lhrlog, lhrlog=lhrlog
              Specify the frequency of inserting entries into the long
              distance matching hash table.

              This option is ignored unless long distance matching is
              enabled.

              Larger values will improve compression speed. Deviating far
              from the default value will likely result in a decrease in
              compression ratio.

              The default value is wlog - lhlog.

   Example
       The following parameters set advanced compression options to
       something similar to predefined level 19 for files bigger than 256
       KB:

       --zstd=wlog=23,clog=23,hlog=22,slog=6,mml=3,tlen=48,strat=6

   -B#:
       Select the size of each compression job. This parameter is available
       only when multi-threading is enabled. The default value is
       4 * windowSize, which means it varies depending on compression
       level. -B# makes it possible to select a custom value. Note that job
       size must respect a minimum value which is enforced transparently.
       This minimum is either 1 MB, or overlapSize, whichever is larger.
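A sketch combining a custom job size with multithreading (the file name is illustrative; on builds without multithread support the -T and -B modifiers have no effect):

```shell
# Compress with 4 worker threads, cutting the input into 8 MiB jobs.
mkdir -p /tmp/zstd-job-demo && cd /tmp/zstd-job-demo
seq 1 200000 > input.txt

zstd -q -f -T4 -B8MiB input.txt
zstd -t -q input.txt.zst        # verify integrity of the result
```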

BUGS
       Report bugs at: https://github.com/facebook/zstd/issues

AUTHOR
       Yann Collet

zstd 1.3.8                       December 2018                         ZSTD(1)