sox(1) - f33

SoX(1)                          Sound eXchange                          SoX(1)

NAME

       SoX - Sound eXchange, the Swiss Army knife of audio manipulation

SYNOPSIS

       sox [global-options] [format-options] infile1
            [[format-options] infile2] ... [format-options] outfile
            [effect [effect-options]] ...

       play [global-options] [format-options] infile1
            [[format-options] infile2] ... [format-options]
            [effect [effect-options]] ...

       rec [global-options] [format-options] outfile
            [effect [effect-options]] ...

DESCRIPTION

21   Introduction
22       SoX  reads  and  writes  audio  files  in  most popular formats and can
23       optionally apply  effects  to  them.  It  can  combine  multiple  input
24       sources,  synthesise audio, and, on many systems, act as a general pur‐
25       pose audio player or a multi-track audio recorder. It also has  limited
26       ability to split the input into multiple output files.
27
28       All SoX functionality is available using just the sox command.  To sim‐
29       plify playing and recording audio, if SoX is invoked as play, the  out‐
30       put  file  is  automatically set to be the default sound device, and if
31       invoked as rec, the default sound device is used as  an  input  source.
32       Additionally,  the  soxi(1)  command  provides a convenient way to just
33       query audio file header information.
34
35       The heart of SoX is a  library  called  libSoX.   Those  interested  in
36       extending  SoX or using it in other programs should refer to the libSoX
37       manual page: libsox(3).
38
39       SoX is a command-line audio processing  tool,  particularly  suited  to
40       making  quick,  simple  edits  and to batch processing.  If you need an
41       interactive, graphical audio editor, use audacity(1).
42
43                                 *        *        *
44
45       The overall SoX processing chain can be summarised as follows:
46
47                      Input(s) → Combiner → Effects → Output(s)
48
       Note, however, that on the SoX command line, the positions of the
       Output(s) and the Effects are swapped with respect to the logical flow
       just shown.  Note also that whilst options pertaining to files are
       placed before their respective file name, the opposite is true for
       effects.  To show how this works in practice, here is a selection of
       examples of how SoX might be used.  The simple
          sox recital.au recital.wav
       translates an audio file in Sun AU format to a Microsoft WAV file,
       whilst
          sox recital.au -b 16 recital.wav channels 1 rate 16k fade 3 norm
       performs the same format translation, but also applies four effects
       (down-mix to one channel, sample rate change, fade-in, normalize), and
       stores the result at a bit-depth of 16.
          sox -r 16k -e signed -b 8 -c 1 voice-memo.raw voice-memo.wav
       converts `raw' (a.k.a. `headerless') audio to a self-describing file
       format,
          sox slow.aiff fixed.aiff speed 1.027
       adjusts audio speed,
          sox short.wav long.wav longer.wav
       concatenates two audio files, and
          sox -m music.mp3 voice.wav mixed.flac
       mixes together two audio files.
          play "The Moonbeams/Greatest/*.ogg" bass +3
       plays a collection of audio files whilst applying a bass boosting
       effect,
          play -n -c1 synth sin %-12 sin %-9 sin %-5 sin %-2 fade h 0.1 1 0.1
       plays a synthesised `A minor seventh' chord with a pipe-organ sound,
          rec -c 2 radio.aiff trim 0 30:00
       records half an hour of stereo audio, and
          play -q take1.aiff & rec -M take1.aiff take1-dub.aiff
       (with POSIX shell and where supported by hardware) records a new track
       in a multi-track recording.  Finally,
          rec -r 44100 -b 16 -e signed-integer -p \
            silence 1 0.50 0.1% 1 10:00 0.1% | \
            sox -p song.ogg silence 1 0.50 0.1% 1 2.0 0.1% : \
            newfile : restart
       records a stream of audio such as LP/cassette and splits it into
       multiple audio files at points with 2 seconds of silence.  Also, it
       does not start recording until it detects that audio is playing, and
       it stops after it sees 10 minutes of silence.
89
90       N.B.  The above is just an overview  of  SoX's  capabilities;  detailed
91       explanations  of  how  to  use  all  SoX  parameters, file formats, and
92       effects can be found below in this  manual,  in  soxformat(7),  and  in
93       soxi(1).
94
95   File Format Types
       SoX can work with `self-describing' and `raw' audio files.
       `Self-describing' formats (e.g. WAV, FLAC, MP3) have a header that
       completely describes the signal and encoding attributes of the audio
       data that follows.  `Raw' or `headerless' formats do not contain this
       information, so the audio characteristics of these must be described
       on the SoX command line or inferred from those of the input file.
102
103       The following four characteristics are used to describe the  format  of
104       audio data such that it can be processed with SoX:
105
106       sample rate
107              The  sample rate in samples per second (`Hertz' or `Hz').  Digi‐
108              tal telephony  traditionally  uses  a  sample  rate  of  8000 Hz
109              (8 kHz), though these days, 16 and even 32 kHz are becoming more
110              common. Audio Compact Discs  use  44100 Hz  (44.1 kHz).  Digital
111              Audio  Tape  and  many computer systems use 48 kHz. Professional
112              audio systems often use 96 kHz.
113
114       sample size
115              The number of bits used to store each sample.  Today, 16-bit  is
116              commonly  used.  8-bit was popular in the early days of computer
117              audio. 24-bit is used in the  professional  audio  arena.  Other
118              sizes are also used.
119
120       data encoding
121              The   way   in  which  each  audio  sample  is  represented  (or
122              `encoded').  Some encodings have variants with  different  byte-
123              orderings  or  bit-orderings.   Some  compress the audio data so
124              that the stored audio data takes up less space (i.e. disk  space
125              or  transmission bandwidth) than the other format parameters and
126              the number of samples would imply.  Commonly-used encoding types
127              include  floating-point,  μ-law, ADPCM, signed-integer PCM, MP3,
128              and FLAC.
129
130       channels
131              The number  of  audio  channels  contained  in  the  file.   One
132              (`mono')  and  two (`stereo') are widely used.  `Surround sound'
133              audio typically contains six or more channels.
134
135       The term `bit-rate' is a measure of the amount of storage  occupied  by
136       an  encoded  audio signal over a unit of time.  It can depend on all of
137       the above and is typically denoted as a number of kilo-bits per  second
138       (kbps).   An  A-law  telephony  signal  has  a  bit-rate  of  64  kbps.
139       MP3-encoded stereo music typically has  a  bit-rate  of  128-196  kbps.
140       FLAC-encoded stereo music typically has a bit-rate of 550-760 kbps.
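
       As a worked check of the figures above: for uncompressed audio the
       bit-rate is simply sample rate x sample size x channels.  The sketch
       below is plain shell arithmetic; the function name bitrate_kbps is
       hypothetical and is not part of SoX.

```shell
# bitrate_kbps RATE_HZ BITS CHANNELS
# Prints the bit-rate of uncompressed audio in kilo-bits per second
# (integer division, so fractional kbps are truncated).
bitrate_kbps() {
    echo $(( $1 * $2 * $3 / 1000 ))
}

bitrate_kbps 8000 8 1      # A-law telephony: 8 kHz, 8-bit, mono -> prints 64
bitrate_kbps 44100 16 2    # audio CD: 44.1 kHz, 16-bit, stereo  -> prints 1411
```

       The MP3 and FLAC figures quoted above are lower than the CD figure
       because those encodings compress the audio data.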
141
142       Most self-describing formats also allow textual `comments' to be embed‐
143       ded in the file that can be used to describe the  audio  in  some  way,
144       e.g. for music, the title, the author, etc.
145
146       One  important  use  of  audio file comments is to convey `Replay Gain'
147       information.  SoX supports applying Replay Gain information  (for  cer‐
148       tain input file formats only; currently, at least FLAC and Ogg Vorbis),
149       but not generating it.  Note that by default,  SoX  copies  input  file
150       comments  to  output  files  that support comments, so output files may
151       contain Replay Gain information if some was present in the input  file.
152       In  this  case,  if  anything other than a simple format conversion was
153       performed then the output file Replay Gain information is likely to  be
154       incorrect and so should be recalculated using a tool that supports this
155       (not SoX).
156
157       The soxi(1) command can be used to display information from audio  file
158       headers.
159
160   Determining & Setting The File Format
161       There  are  several mechanisms available for SoX to use to determine or
162       set the format characteristics of an audio file.  Depending on the cir‐
163       cumstances,  individual  characteristics may be determined or set using
164       different mechanisms.
165
166       To determine the format of an input file, SoX will  use,  in  order  of
167       precedence and as given or available:
168
169       1.  Command-line format options.
170
171       2.  The contents of the file header.
172
173       3.  The filename extension.
174
175       To set the output file format, SoX will use, in order of precedence and
176       as given or available:
177
178       1.  Command-line format options.
179
180       2.  The filename extension.
181
182       3.  The input file format characteristics, or the closest that is  sup‐
183           ported by the output file type.
184
185       For  all  files, SoX will exit with an error if the file type cannot be
186       determined. Command-line format options may need to be added or changed
187       to resolve the problem.
188
189   Playing & Recording Audio
190       The  play  and  rec  commands  are  provided  so that basic playing and
191       recording is as simple as
192          play existing-file.wav
193       and
194          rec new-file.wav
195       These two commands are functionally equivalent to
196          sox existing-file.wav -d
197       and
198          sox -d new-file.wav
199       Of course, further options and effects  (as  described  below)  can  be
200       added to the commands in either form.
201
202                                 *        *        *
203
204       Some  systems  provide  more  than  one  type of (SoX-compatible) audio
205       driver, e.g. ALSA & OSS, or SUNAU & AO.  Systems  can  also  have  more
206       than  one  audio  device (a.k.a. `sound card').  If more than one audio
207       driver has been built-in to SoX, and the default selected by  SoX  when
208       recording  or  playing  is  not the one that is wanted, then the AUDIO‐
209       DRIVER environment variable can be used to override the  default.   For
210       example (on many systems):
211          set AUDIODRIVER=oss
212          play ...
213       The  AUDIODEV  environment variable can be used to override the default
214       audio device, e.g.
215          set AUDIODEV=/dev/dsp2
216          play ...
217          sox ... -t oss
218       or
219          set AUDIODEV=hw:soundwave,1,2
220          play ...
221          sox ... -t alsa
222       Note that the way of setting environment variables varies  from  system
223       to system - for some specific examples, see `SOX_OPTS' below.
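
       In a POSIX shell (sh, bash, and similar), the same settings might be
       made as in the following sketch.  The driver and device names are the
       ones used in the examples above; the play invocations are left
       commented out because they depend on the local audio setup.

```shell
# Select the OSS driver and a specific device for later SoX commands
# started from this shell.
AUDIODRIVER=oss
AUDIODEV=/dev/dsp2
export AUDIODRIVER AUDIODEV

# play existing-file.wav      # would now use the settings exported above

# Alternatively, set a variable for a single command only:
# AUDIODRIVER=alsa play existing-file.wav
```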
224
225       When  playing  a  file  with a sample rate that is not supported by the
226       audio output device, SoX will automatically invoke the rate  effect  to
227       perform  the  necessary sample rate conversion.  For compatibility with
228       old hardware, the default rate quality level is set to `low'. This  can
229       be  changed  by  explicitly specifying the rate effect with a different
230       quality level, e.g.
231          play ... rate -m
232       or by using the --play-rate-arg option (see below).
233
234                                 *        *        *
235
236       On some systems, SoX allows audio playback volume to be adjusted whilst
237       using play.  Where supported, this is achieved by tapping the `v' & `V'
238       keys during playback.
239
240       To help with setting a suitable recording level, SoX includes  a  peak-
241       level  meter  which can be invoked (before making the actual recording)
242       as follows:
243          rec -n
244       The recording level should be adjusted (using the system-provided mixer
245       program, not SoX) so that the meter is at most occasionally full scale,
246       and never `in the red' (an exclamation mark is  shown).   See  also  -S
247       below.
248
249   Accuracy
250       Many  file formats that compress audio discard some of the audio signal
251       information whilst doing so. Converting to such a format and then  con‐
252       verting  back  again  will  not  produce  an exact copy of the original
253       audio.  This is the case for many formats used in  telephony  (e.g.  A-
254       law,  GSM) where low signal bandwidth is more important than high audio
255       fidelity, and for many formats used in  portable  music  players  (e.g.
256       MP3,  Vorbis)  where  adequate  fidelity  can be retained even with the
257       large compression ratios that are needed to make portable players prac‐
258       tical.
259
260       Formats that discard audio signal information are called `lossy'.  For‐
261       mats that do not are called `lossless'.  The term `quality' is used  as
262       a  measure  of  how closely the original audio signal can be reproduced
263       when using a lossy format.
264
265       Audio file conversion with SoX is lossless when it can  be,  i.e.  when
266       not  using  lossy  compression,  when not reducing the sampling rate or
267       number of channels, and when the number of bits used in the destination
268       format is not less than in the source format.  E.g.  converting from an
269       8-bit PCM format to a 16-bit PCM format is lossless but converting from
270       an 8-bit PCM format to (8-bit) A-law isn't.
271
272       N.B.   SoX  converts all audio files to an internal uncompressed format
273       before performing any audio processing. This means that manipulating  a
274       file that is stored in a lossy format can cause further losses in audio
275       fidelity.  E.g. with
276          sox long.mp3 short.mp3 trim 10
277       SoX first decompresses the  input  MP3  file,  then  applies  the  trim
278       effect,  and  finally creates the output MP3 file by re-compressing the
279       audio - with a possible reduction in fidelity above that which occurred
280       when  the input file was created.  Hence, if what is ultimately desired
281       is lossily compressed audio, it is highly recommended  to  perform  all
282       audio  processing  using  lossless file formats and then convert to the
283       lossy format only at the final stage.
284
285       N.B.  Applying multiple effects with a single SoX invocation  will,  in
286       general, produce more accurate results than those produced using multi‐
287       ple SoX invocations.
288
289   Dithering
290       Dithering is a technique used to maximise the dynamic  range  of  audio
291       stored  at a particular bit-depth. Any distortion introduced by quanti‐
292       sation is decorrelated by adding a small amount of white noise  to  the
293       signal.  In most cases, SoX can determine whether the selected process‐
294       ing requires dither and will add it during output formatting if  appro‐
295       priate.
296
297       Specifically,  by  default, SoX automatically adds TPDF dither when the
298       output bit-depth is less than 24 and any of the following are true:
299
300       ·   bit-depth reduction has been specified explicitly using a  command-
301           line option
302
303       ·   the  output file format supports only bit-depths lower than that of
304           the input file format
305
306       ·   an effect has increased effective  bit-depth  within  the  internal
307           processing chain
308
309       For  example,  adjusting  volume  with vol 0.25 requires two additional
310       bits in which to losslessly  store  its  results  (since  0.25  decimal
311       equals  0.01 binary).  So if the input file bit-depth is 16, then SoX's
312       internal representation will utilise 18 bits after processing this vol‐
313       ume  change.   In  order  to  store the output at the same depth as the
314       input, dithering is used to remove the additional bits.
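
       The arithmetic in this example can be sketched as follows.  This is
       only an illustration of the bit-counting argument, valid when the
       volume factor is an exact negative power of two (0.5, 0.25, ...); the
       function name extra_bits is hypothetical and does not reflect how SoX
       tracks bit-depth internally.

```shell
# extra_bits GAIN
# Prints log2(1/GAIN): the number of additional bits needed to store the
# result of scaling by GAIN without loss, assuming GAIN is 0.5, 0.25, ...
extra_bits() {
    awk -v g="$1" 'BEGIN { printf "%.0f\n", log(1 / g) / log(2) }'
}

in_depth=16
echo $(( in_depth + $(extra_bits 0.25) ))    # prints 18
```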
315
316       Use the -V option to see what processing SoX has  automatically  added.
317       The  -D option may be given to override automatic dithering.  To invoke
318       dithering manually (e.g. to select  a  noise-shaping  curve),  see  the
319       dither effect.
320
321   Clipping
322       Clipping is distortion that occurs when an audio signal level (or `vol‐
323       ume') exceeds the range of the chosen representation.  In  most  cases,
324       clipping  is  undesirable  and  so should be corrected by adjusting the
325       level prior to the point (in the processing chain) at which it occurs.
326
327       In SoX, clipping could occur, as you might expect, when using  the  vol
328       or gain effects to increase the audio volume. Clipping could also occur
329       with many other effects, when converting one  format  to  another,  and
330       even when simply playing the audio.
331
332       Playing an audio file often involves resampling, and processing by ana‐
333       logue components can introduce a small DC offset and/or  amplification,
334       all  of which can produce distortion if the audio signal level was ini‐
335       tially too close to the clipping point.
336
337       For these reasons, it is usual to make sure that an audio file's signal
338       level  has  some `headroom', i.e. it does not exceed a particular level
339       below the maximum possible level for the  given  representation.   Some
340       standards  bodies recommend as much as 9dB headroom, but in most cases,
341       3dB (≈ 70% linear) is enough.  Note that this wisdom seems to have been
342       lost in modern music production; in fact, many CDs, MP3s, etc.  are now
343       mastered at levels above 0dBFS i.e. the audio is clipped as delivered.
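
       The relationship between headroom in dB and linear amplitude quoted
       above is 10^(-dB/20).  A minimal sketch using awk (the function name
       db_to_linear is hypothetical):

```shell
# db_to_linear DB
# Prints the linear amplitude, as a fraction of full scale, that leaves
# DB decibels of headroom.
db_to_linear() {
    awk -v db="$1" 'BEGIN { printf "%.2f\n", exp(-db / 20 * log(10)) }'
}

db_to_linear 3    # prints 0.71, i.e. roughly 70% of full scale
db_to_linear 9    # prints 0.35
```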
344
345       SoX's stat and stats effects can assist in determining the signal level
346       in  an  audio file. The gain or vol effect can be used to prevent clip‐
347       ping, e.g.
348          sox dull.wav bright.wav gain -6 treble +6
349       guarantees that the treble boost will not clip.
350
351       If clipping occurs at any point during processing, SoX will  display  a
352       warning message to that effect.
353
354       See also -G and the gain and norm effects.
355
356   Input File Combining
357       SoX's  input  combiner can be configured (see OPTIONS below) to combine
358       multiple files using  any  of  the  following  methods:  `concatenate',
359       `sequence',  `mix',  `mix-power',  `merge', or `multiply'.  The default
360       method is `sequence' for play, and `concatenate' for rec and sox.
361
362       For all methods other than `sequence', multiple input files  must  have
363       the  same  sampling rate. If necessary, separate SoX invocations can be
364       used to make sampling rate adjustments prior to combining.
365
366       If the `concatenate' combining method is selected (usually,  this  will
367       be  by  default) then the input files must also have the same number of
368       channels.  The audio from each input will be concatenated in the  order
369       given to form the output file.
370
371       The `sequence' combining method is selected automatically for play.  It
372       is similar to `concatenate' in that the audio from each input  file  is
373       sent  serially to the output file. However, here the output file may be
374       closed and reopened  at  the  corresponding  transition  between  input
375       files.  This may be just what is needed when sending different types of
376       audio to an output device, but is not generally useful when the  output
377       is a normal file.
378
379       If  either  the  `mix' or `mix-power' combining method is selected then
380       two or more input files must be given and will  be  mixed  together  to
381       form  the  output file.  The number of channels in each input file need
382       not be the same, but SoX will issue a warning if they are not and  some
383       channels  in  the  output  file will not contain audio from every input
384       file.  A mixed audio file cannot be un-mixed without reference  to  the
385       original input files.
386
387       If  the  `merge'  combining  method  is selected then two or more input
388       files must be given and will be merged  together  to  form  the  output
389       file.   The number of channels in each input file need not be the same.
390       A merged audio file comprises all of the channels from all of the input
391       files.  Un-merging  is  possible using multiple invocations of SoX with
392       the remix effect.  For example, two mono files could be merged to  form
393       one  stereo file. The first and second mono files would become the left
394       and right channels of the stereo file.
395
396       The `multiply' combining method multiplies the sample values of  corre‐
397       sponding  channels  (treated  as numbers in the interval -1 to +1).  If
398       the number of channels in the input files is not the same, the  missing
399       channels are considered to contain all zero.
400
401       When  combining input files, SoX applies any specified effects (includ‐
402       ing, for example, the vol volume adjustment effect) after the audio has
403       been combined. However, it is often useful to be able to set the volume
404       of (i.e. `balance') the inputs  individually,  before  combining  takes
405       place.
406
407       For  all  combining  methods, input file volume adjustments can be made
408       manually using the -v option (below) which can be given for one or more
409       input  files.  If it is given for only some of the input files then the
410       others receive no volume adjustment.  In some circumstances,  automatic
411       volume adjustments may be applied (see below).
412
413       The -V option (below) can be used to show the input file volume adjust‐
414       ments that have been selected (either manually or automatically).
415
       There are some special considerations that need to be made when mixing
       input files:
418
419       Unlike  the  other  methods, `mix' combining has the potential to cause
420       clipping in the combiner if no balancing is performed.  In  this  case,
421       if manual volume adjustments are not given, SoX will try to ensure that
422       clipping does not occur by automatically adjusting the  volume  (ampli‐
423       tude) of each input signal by a factor of ¹/n, where n is the number of
424       input files.  If this results in audio that is too quiet  or  otherwise
425       unbalanced then the input file volumes can be set manually as described
426       above. Using the norm effect on the mix is another alternative.
427
428       If mixed audio seems loud enough at some points but too quiet in others
429       then  dynamic range compression should be applied to correct this - see
430       the compand effect.
431
432       With the `mix-power' combine method, the mixed volume is  approximately
433       equal to that of one of the input signals.  This is achieved by balanc‐
434       ing using a factor of ¹/√n instead of ¹/n.  Note  that  this  balancing
435       factor  does not guarantee that clipping will not occur, but the number
436       of clips will usually be low and the resultant distortion is  generally
437       imperceptible.
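
       To make the two balancing factors concrete, the sketch below prints
       1/n and 1/sqrt(n) for a given number of input files.  It only
       reproduces the arithmetic described above; the function name
       mix_factors is hypothetical.

```shell
# mix_factors N
# Prints the per-input amplitude factor for `mix' (1/N) followed by the
# factor for `mix-power' (1/sqrt(N)).
mix_factors() {
    awk -v n="$1" 'BEGIN { printf "%.3f %.3f\n", 1 / n, 1 / sqrt(n) }'
}

mix_factors 2    # prints 0.500 0.707
mix_factors 4    # prints 0.250 0.500
```

       If neither automatic factor gives the desired balance, the inputs can
       be weighted by hand with -v, for instance (factors chosen purely for
       illustration)
          sox -m -v 0.7 music.mp3 -v 0.3 voice.wav mixed.flac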
438
439   Output Files
440       SoX's  default  behaviour  is to take one or more input files and write
441       them to a single output file.
442
443       This behaviour can be changed by specifying the pseudo-effect `newfile'
444       within the effects list.  SoX will then enter multiple output mode.
445
446       In  multiple  output mode, a new file is created when the effects prior
447       to the `newfile' indicate they are  done.   The  effects  chain  listed
448       after  `newfile'  is then started up and its output is saved to the new
449       file.
450
451       In multiple output mode, a unique number will automatically be appended
452       to the end of all filenames.  If the filename has an extension then the
453       number is inserted before the extension.  This behaviour can be custom‐
454       ized  by  placing a %n anywhere in the filename where the number should
455       be substituted.  An optional number can be placed after the % to  indi‐
456       cate a minimum fixed width for the number.
457
458       Multiple output mode is not very useful unless an effect that will stop
459       the effects chain early is specified before the `newfile'.  If  end  of
460       file  is reached before the effects chain stops itself then no new file
461       will be created as it would be empty.
462
463       The following is an example of splitting the first  60  seconds  of  an
464       input file into two 30 second files and ignoring the rest.
465          sox song.wav ringtone%1n.wav trim 0 30 : newfile : trim 0 30
466
467   Stopping SoX
468       Usually SoX will complete its processing and exit automatically once it
469       has read all available audio data from the input files.
470
471       If desired, it can be terminated earlier by sending an interrupt signal
472       to the process (usually by pressing the keyboard interrupt key which is
473       normally Ctrl-C).  This is a natural requirement in some circumstances,
474       e.g.  when  using SoX to make a recording.  Note that when using SoX to
475       play multiple files, Ctrl-C behaves slightly differently:  pressing  it
476       once  causes  SoX  to skip to the next file; pressing it twice in quick
477       succession causes SoX to exit.
478
479       Another option to stop processing early is to use an effect that has  a
480       time  period  or sample count to determine the stopping point. The trim
481       effect is an example of this.  Once all  effects  chains  have  stopped
482       then SoX will also stop.
483

FILENAMES

485       Filenames can be simple file names, absolute or relative path names, or
486       URLs (input files only).  Note that URL support requires  that  wget(1)
487       is available.
488
489       Note:  Giving SoX an input or output filename that is the same as a SoX
490       effect-name will not  work  since  SoX  will  treat  it  as  an  effect
491       specification.    The  only  work-around  to  this  is  to  avoid  such
492       filenames. This is generally not difficult since most  audio  filenames
493       have a filename `extension', whilst effect-names do not.
494
495   Special Filenames
496       The following special filenames may be used in certain circumstances in
497       place of a normal filename on the command line:
498
       -      SoX can be used in simple pipeline operations by using the
              special filename `-' which, if used as an input filename, will
              cause SoX to read audio data from `standard input' (stdin), and
              which, if used as the output filename, will cause SoX to send
              audio data to `standard output' (stdout).  Note that when using
              this option for the output file, and sometimes when using it
              for an input file, the file-type (see -t below) must also be
              given.
507
       "|program [options] ..."
              This can be used in place of an input filename to specify that
              the given program's standard output (stdout) be used as an input
              file.  Unlike - (above), this can be used for several inputs to
              one SoX command.  For example, if `genw' generates mono WAV
              formatted signals to its standard output, then the following
              command makes a stereo file from two generated signals:
                 sox -M "|genw --imd -" "|genw --thd -" out.wav
              For headerless (raw) audio, -t (and perhaps other format
              options) will need to be given, preceding the input command.
518
519       "wildcard-filename"
520              Specifies  that  filename `globbing' (wild-card matching) should
521              be performed by SoX instead of by the shell.  This allows a sin‐
522              gle  set of file options to be applied to a group of files.  For
523              example, if the current directory contains  three  `vox'  files,
524              file1.vox, file2.vox, and file3.vox, then
525                 play --rate 6k *.vox
526              will be expanded by the `shell' (in most environments) to
527                 play --rate 6k file1.vox file2.vox file3.vox
528              which will treat only the first vox file as having a sample rate
529              of 6k.  With
530                 play --rate 6k "*.vox"
531              the given sample rate option will be applied to  all  three  vox
532              files.
533
       -p, --sox-pipe
              This can be used in place of an output filename to specify that
              the SoX command should be used as an input pipe to another SoX
              command.  For example, the command:
                 play "|sox -n -p synth 2" "|sox -n -p synth 2 tremolo 10" stat
              plays two `files' in succession, each with different effects.

              -p is in fact an alias for `-t sox -'.
542
543       -d, --default-device
544              This  can  be  used  in  place of an input or output filename to
545              specify that the default audio device (if  one  has  been  built
546              into  SoX)  is to be used.  This is akin to invoking rec or play
547              (as described above).
548
549       -n, --null
550              This can be used in place of an  input  or  output  filename  to
551              specify that a `null file' is to be used.  Note that here, `null
552              file' refers to a SoX-specific mechanism and is not  related  to
553              any operating-system mechanism with a similar name.
554
555              Using a null file to input audio is equivalent to using a normal
556              audio file that contains an infinite amount of silence,  and  as
557              such  is  not  generally  useful unless used with an effect that
558              specifies a finite time length (such as trim or synth).
559
560              Using a null file to output  audio  amounts  to  discarding  the
561              audio and is useful mainly with effects that produce information
562              about the audio instead of affecting it (such  as  noiseprof  or
563              stat).
564
565              The  sampling  rate  associated  with  a null file is by default
566              48 kHz, but, as with a normal file, this can  be  overridden  if
567              desired using command-line format options (see below).
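
       The difference between a quoted and an unquoted wildcard, described
       under "wildcard-filename" above, can be observed without running SoX
       at all.  The sketch below creates three empty placeholder files in a
       scratch directory and counts the arguments a command would receive;
       count_args is a hypothetical stand-in for play.

```shell
# Create three empty placeholder files in a scratch directory.
dir=$(mktemp -d)
cd "$dir" || exit 1
touch file1.vox file2.vox file3.vox

# count_args prints how many arguments it was given.
count_args() { echo $#; }

# Unquoted: the shell expands the pattern into three separate file names.
unquoted=$(count_args --rate 6k *.vox)
# Quoted: the pattern is passed through as a single, unexpanded argument.
quoted=$(count_args --rate 6k "*.vox")

echo "unquoted: $unquoted, quoted: $quoted"    # prints "unquoted: 5, quoted: 3"
```

       In the quoted case SoX itself receives the pattern and, as described
       above, applies the preceding format options to every matching file.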
568
569   Supported File & Audio Device Types
570       See  soxformat(7) for a list and description of the supported file for‐
571       mats and audio device drivers.
572

OPTIONS

574   Global Options
575       These options can be specified on the command line at any point  before
576       the first effect name.
577
578       The  SOX_OPTS  environment  variable can be used to provide alternative
579       default values for SoX's global options.  For example:
580          SOX_OPTS="--buffer 20000 --play-rate-arg -hs --temp /mnt/temp"
581       Note that setting SOX_OPTS can potentially create unwanted  changes  in
582       the  behaviour  of scripts or other programs that invoke SoX.  SOX_OPTS
583       might best be used for things (such  as  in  the  given  example)  that
584       reflect  the  environment  in which SoX is being run.  Enabling options
585       such as --no-clobber as default might be handled better using  a  shell
586       alias since a shell alias will not affect operation in scripts etc.
587
588       One  way  to  ensure that a script cannot be affected by SOX_OPTS is to
589       clear SOX_OPTS at the start of the script, but this of course loses the
590       benefit  of  SOX_OPTS  carrying  some  system-wide default options.  An
591       alternative approach is to explicitly invoke SoX  with  default  option
592       values, e.g.
593          SOX_OPTS="-V --no-clobber"
594          ...
595          sox -V2 --clobber $input $output ...
596       Note  that  the  way to set environment variables varies from system to
597       system. Here are some examples:
598
599       Unix bash:
600          export SOX_OPTS="-V --no-clobber"
601       Unix csh:
602          setenv SOX_OPTS "-V --no-clobber"
603       MS-DOS/MS-Windows:
604          set SOX_OPTS=-V --no-clobber
605       MS-Windows GUI: via Control Panel : System  :  Advanced  :  Environment
606       Variables
607
608       Mac OS X GUI: Refer to Apple's Technical Q&A QA1067 document.
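       The script-side precautions described above might look roughly like the
       following sketch for a POSIX shell.  The wrapper name convert is
       hypothetical, and the sketch applies both precautions at once, which is
       more than is strictly needed.

```shell
#!/bin/sh
# Drop any SOX_OPTS inherited from the caller so that this script's
# behaviour does not depend on the user's environment.
unset SOX_OPTS

# Hypothetical wrapper: state the wanted option values explicitly
# instead of relying on defaults.
convert() {
    sox -V2 --clobber "$1" "$2"
}
```

       Quoting "$1" and "$2" keeps filenames that contain spaces intact.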
609
610       --buffer BYTES, --input-buffer BYTES
611              Set  the  size in bytes of the buffers used for processing audio
612              (default 8192).  --buffer applies to input, effects, and  output
613              processing; --input-buffer applies only to input processing (for
614              which it overrides --buffer if both are given).
615
616              Be aware that large values for --buffer will cause SoX to become
617              slow to respond to requests to terminate or to skip the  current
618              input file.
619
620       --clobber
621              Don't prompt before overwriting an existing file with  the  same
622              name as that given for the output file.  This is the default be‐
623              haviour.
624
625       --combine concatenate|merge|mix|mix-power|multiply|sequence
626              Select the input file combining method; for some of these, short
627              options are available: -m selects `mix', -M selects `merge', and
628              -T selects `multiply'.
629
630              See Input File Combining above for a description of the  differ‐
631              ent combining methods.
632
633       -D, --no-dither
634              Disable automatic dither - see `Dithering' above.  An example of
635              why this might occasionally be useful: suppose a file  has  been
636              converted from 16 to 24 bit with the intention of processing it,
637              but no processing turns out to be needed and the original 16 bit
638              file has been lost.  Then, strictly speaking, no dither is
639              needed when converting the file back to 16 bit.
640              See also the stats effect for how to determine the actual bit
641              depth of the audio within a file.
642
643       --effects-file FILENAME
644              Use FILENAME to obtain all effects  and  their  arguments.   The
645              file  is  parsed  as if the values were specified on the command
646              line.  A new line can be used in place of the special  :  marker
647              to separate effect chains.  For convenience, such markers at the
648              end of the file are normally ignored; if you want to specify  an
649              empty  last  effects  chain,  use an explicit : by itself on the
650              last line of the file.  This option causes any effects specified
651              on the command line to be discarded.
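              A minimal sketch of an effects file follows.  It assumes, as the
              text above describes, that each line forms one effects chain, so
              the file should correspond to the command-line effects
              `trim 0 30 : newfile : restart'.  The file name split.effects
              and the wrapper name split_audio are hypothetical.

```shell
#!/bin/sh
# Write a hypothetical effects file with one effects chain per line.
cat > split.effects <<'EOF'
trim 0 30
newfile
restart
EOF

# Hypothetical wrapper: --effects-file is a global option, so it is given
# before the filenames; "$1" is the input file and "$2" the output file.
split_audio() {
    sox --effects-file split.effects "$1" "$2"
}
```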
652
653       -G, --guard
654              Automatically  invoke the gain effect to guard against clipping.
655              E.g.
656                 sox -G infile -b 16 outfile rate 44100 dither -s
657              is shorthand for
658                 sox infile -b 16 outfile gain -h rate 44100 gain -rh dither -s
659              See also -V, --norm, and the gain effect.
660
661       -h, --help
662              Show version number and usage information.
663
664       --help-effect NAME
665              Show usage information on the specified effect.   The  name  all
666              can be used to show usage on all effects.
667
668       --help-format NAME
669              Show  information about the specified file format.  The name all
670              can be used to show information on all formats.
671
672       --i, --info
673              Only if given as the first parameter to sox, behave as soxi(1).
674
675       -m|-M  Equivalent to --combine mix and --combine merge, respectively.
676
677       --magic
678              If SoX has been built with the optional `libmagic' library  then
679              this  option can be given to enable its use in helping to detect
680              audio file types.
681
682       --multi-threaded | --single-threaded
683              By default, SoX is `single threaded'.  If  the  --multi-threaded
684              option is given however then SoX will process audio channels for
685              most multi-channel effects in parallel on hyper-threading/multi-
686              core  architectures.  This  may  reduce  processing time, though
687              sometimes it may be necessary to use this option in  conjunction
688              with  a larger buffer size than is the default to gain any bene‐
689              fit from multi-threaded processing (e.g.  131072;  see  --buffer
690              above).
691
692       --no-clobber
693              Prompt before overwriting an existing file with the same name as
694              that given for the output file.
695
696              N.B.  Unintentionally overwriting a  file  is  easier  than  you
697              might think: for example, if you accidentally enter
698                 sox file1 file2 effect1 effect2 ...
699              when what you really meant was
700                 play file1 file2 effect1 effect2 ...
701              then,  without  this  option, file2 will be overwritten.  Hence,
702              using this option is recommended. SOX_OPTS  (above),  a  `shell'
703              alias, script, or batch file may be an appropriate way of perma‐
704              nently enabling it.
705
706       --norm[=dB-level]
707              Automatically invoke the gain effect to guard  against  clipping
708              and to normalise the audio. E.g.
709                 sox --norm infile -b 16 outfile rate 44100 dither -s
710              is shorthand for
711                 sox infile -b 16 outfile gain -h rate 44100 gain -nh dither -s
712              Optionally,  the  audio can be normalized to a given level (usu‐
713              ally) below 0 dBFS:
714                 sox --norm=-3 infile outfile
715
716              See also -V, -G, and the gain effect.
717
718       --play-rate-arg ARG
719              Selects a quality option to be used when the  `rate'  effect  is
720              automatically invoked whilst playing audio.  This option is typ‐
721              ically set via the SOX_OPTS environment variable (see above).
722
723       --plot gnuplot|octave|off
724              If not set to off (the default if --plot is not given), run in a
725              mode  that  can be used, in conjunction with the gnuplot program
726              or the GNU Octave program, to assist with the selection and con‐
727              figuration  of many of the transfer-function based effects.  For
728              the first given effect that supports the selected plotting  pro‐
729              gram,  SoX  will  output  commands to plot the effect's transfer
730              function, and then exit without actually processing  any  audio.
731              E.g.
732                 sox --plot octave input-file -n highpass 1320 > highpass.plt
733                 octave highpass.plt
734
735       -q, --no-show-progress
736              Run  in  quiet  mode when SoX wouldn't otherwise do so.  This is
737              the opposite of the -S option.
738
739       -R     Run in `repeatable' mode.  When  this  option  is  given,  where
740              applicable, SoX will embed a fixed time-stamp in the output file
741              (e.g.  AIFF) and will `seed'  pseudo  random  number  generators
742              (e.g.   dither)  with a fixed number, thus ensuring that succes‐
743              sive SoX invocations with the same inputs and the  same  parame‐
744              ters yield the same output.
745
746       --replay-gain track|album|off
747              Select  whether  or not to apply replay-gain adjustment to input
748              files.  The default is off for sox and rec, album for play where
749              (at  least)  the  first two input files are tagged with the same
750              Artist and Album names, and track for play otherwise.
751
752       -S, --show-progress
753              Display input file  format/header  information,  and  processing
754              progress as input file(s) percentage complete, elapsed time, and
755              remaining time (if known; shown in brackets), and the number  of
756              samples  written to the output file.  Also shown is a peak-level
757              meter, and an indication if clipping has  occurred.   The  peak-
758              level meter shows up to two channels and is calibrated for digi‐
759              tal audio as follows (right channel shown):
760
761                            dB FSD   Display   dB FSD   Display
762                             -25     -          -11     ====
763                             -23     =           -9     ====-
764                             -21     =-          -7     =====
765                             -19     ==          -5     =====-
766                             -17     ==-         -3     ======
767                             -15     ===         -1     =====!
768                             -13     ===-
769
770              A three-second peak-held value of headroom in dBs will be  shown
771              to the right of the meter if this is below 6dB.
772
773              This  option  is  enabled  by  default when using SoX to play or
774              record audio.
775
776       -T     Equivalent to --combine multiply.
777
778       --temp DIRECTORY
779              Specify that any temporary files should be created in the  given
780              DIRECTORY.   This can be useful if there are permission or free-
781              space problems with the default location. In  this  case,  using
782              `--temp  .' (to use the current directory) is often a good solu‐
783              tion.
784
785       --version
786              Show SoX's version number and exit.
787
788       -V[level]
789              Set verbosity. This is particularly useful for  seeing  how  any
790              automatic effects have been invoked by SoX.
791
792              SoX  displays  messages on the console (stderr) according to the
793              following verbosity levels:
794
795              0      No messages are shown at all;  use  the  exit  status  to
796                     determine if an error has occurred.
797
798              1      Only  error  messages  are shown.  These are generated if
799                     SoX cannot complete the requested commands.
800
801              2      Warning messages are also shown.  These are generated  if
802                     SoX  can complete the requested commands, but not exactly
803                     according to the  requested  command  parameters,  or  if
804                     clipping occurs.
805
806              3      Descriptions  of  SoX's processing phases are also shown.
807                     Useful for seeing exactly  how  SoX  is  processing  your
808                     audio.
809
810              4 and above
811                     Messages to help with debugging SoX are also shown.
812
813              By  default,  the  verbosity level is set to 2 (shows errors and
814              warnings). Each occurrence of the -V option increases  the  ver‐
815              bosity  level  by  1.  Alternatively, the verbosity level can be
816              set to an absolute number by specifying it immediately after the
817              -V, e.g.  -V0 sets it to 0.
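              The counting rule just described can be sketched as a small shell
              function.  This illustrates the documented rule only and is not
              SoX's own option parser; the name verbosity_level is
              hypothetical.

```shell
#!/bin/sh
# Start at the default level 2; a bare -V adds 1, and -V followed by a
# number sets the level to that number.  Other arguments are ignored.
verbosity_level() {
    level=2
    for arg in "$@"; do
        case "$arg" in
            -V)       level=$((level + 1)) ;;
            -V[0-9]*) level=${arg#-V} ;;
        esac
    done
    echo "$level"
}
```

              Under this rule, `-V -V' gives level 4 and `-V0' gives level 0.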
818
819   Input File Options
820       These  options  apply  only  to  input files and may precede only input
821       filenames on the command line.
822
823       --ignore-length
824              Override an (incorrect) audio length given in  an  audio  file's
825              header. If this option is given then SoX will keep reading audio
826              until it reaches the end of the input file.
827
828       -v, --volume FACTOR
829              Intended for use  when  combining  multiple  input  files,  this
830              option  adjusts  the  volume  of the file that follows it on the
831              command line by a factor of FACTOR. This allows it to  be  `bal‐
832              anced'  w.r.t.  the other input files.  This is a linear (ampli‐
833              tude) adjustment, so a number less than 1 decreases  the  volume
834              and  a number greater than 1 increases it.  If a negative number
835              is given then in addition to the volume  adjustment,  the  audio
836              signal will be inverted.
837
838              See  also  the  norm,  vol, and gain effects, and see Input File
839              Balancing above.
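              Because FACTOR is a linear amplitude ratio, a level change stated
              in dB has to be converted first, using factor = 10^(dB/20).  The
              helper below is a sketch of that arithmetic; it assumes awk is
              available, and the name db_to_factor is hypothetical.

```shell
#!/bin/sh
# Convert a level change in dB to the linear amplitude factor that -v
# expects: factor = 10^(dB/20), computed as exp(ln(10) * dB / 20).
db_to_factor() {
    awk -v db="$1" 'BEGIN { printf "%.3f\n", exp(log(10) * db / 20) }'
}
```

              For example, `db_to_factor -6' prints 0.501, so an input could
              be lowered by 6 dB before mixing with something like
              `sox -m -v "$(db_to_factor -6)" a.wav b.wav mix.wav' (file names
              hypothetical).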
840
841   Input & Output File Format Options
842       These options apply to the input or output file whose name they immedi‐
843       ately precede on the command line and are used mainly when working with
844       headerless file formats or when specifying a format for the output file
845       that is different to that of the input file.
846
847       -b BITS, --bits BITS
848              The  number  of bits (a.k.a. bit-depth or sometimes word-length)
849              in each encoded sample.  Not  applicable  to  complex  encodings
850              such  as  MP3  or GSM.  Not necessary with encodings that have a
851              fixed number of bits, e.g.  A/μ-law, ADPCM.
852
853              For an input file, the most common use for  this  option  is  to
854              inform SoX of the number of bits per sample in a `raw' (`header‐
855              less') audio file.  For example
856                 sox -r 16k -e signed -b 8 input.raw output.wav
857              converts a particular `raw'  file  to  a  self-describing  `WAV'
858              file.
859
860              For  an output file, this option can be used (perhaps along with
861              -e) to set the output encoding size.  By default (i.e.  if  this
862              option  is  not given), the output encoding size will (providing
863              it is supported by the output file type) be  set  to  the  input
864              encoding size.  For example
865                 sox input.cdda -b 24 output.wav
866              converts  raw  CD  digital  audio  (16-bit, signed-integer) to a
867              24-bit (signed-integer) `WAV' file.
868
869       -c CHANNELS, --channels CHANNELS
870              The number of audio channels in the audio file. This can be  any
871              number greater than zero.
872
873              For  an  input  file,  the most common use for this option is to
874              inform SoX of the number of channels in a  `raw'  (`headerless')
875              audio  file.   Occasionally, it may be useful to use this option
876              with a `headered' file, in order  to  override  the  (presumably
877              incorrect)  value  in  the  header - note that this is only sup‐
878              ported with certain file types.  Examples:
879                 sox -r 48k -e float -b 32 -c 2 input.raw output.wav
880              converts a particular `raw'  file  to  a  self-describing  `WAV'
881              file.
882                 play -c 1 music.wav
883              interprets  the  file  data  as  belonging  to  a single channel
884              regardless of what is indicated in the file header.   Note  that
885              if  the file does in fact have two channels, this will result in
886              the file playing at half speed.
887
888              For an output file, this option provides a shorthand for  speci‐
889              fying  that  the  channels  effect should be invoked in order to
890              change (if necessary) the number of channels in the audio signal
891              to  the  number  given.  For example, the following two commands
892              are equivalent:
893                 sox input.wav -c 1 output.wav bass -b 24
894                 sox input.wav      output.wav bass -b 24 channels 1
895              though the second form is more flexible as it allows the effects
896              to be ordered arbitrarily.
897
898       -e ENCODING, --encoding ENCODING
899              The  audio encoding type.  Sometimes needed with file-types that
900              support more than one encoding type. For example, with raw, WAV,
901              or  AU  (but not, for example, with MP3 or FLAC).  The available
902              encoding types are as follows:
903
904              signed-integer
905                     PCM data stored as signed (`two's complement')  integers.
906                     Commonly  used  with  a 16- or 24-bit encoding size.  A
907                     value of 0 represents minimum signal power.
908
909              unsigned-integer
910                     PCM data stored as unsigned integers.  Commonly used with
911                     an  8-bit encoding size.  A value of 0 represents maximum
912                     signal power.
913
914              floating-point
915                     PCM data stored as IEEE 754 single precision (32-bit)  or
916                     double  precision  (64-bit)  floating-point (`real') num‐
917                     bers.  A value of 0 represents minimum signal power.
918
919              a-law  International telephony standard for logarithmic encoding
920                     to  8  bits per sample.  It has a precision equivalent to
921                     roughly 13-bit PCM and is sometimes encoded with reversed
922                     bit-ordering (see the -X option).
923
924              u-law, mu-law
925                     North  American telephony standard for logarithmic encod‐
926                     ing to 8 bits per sample.  A.k.a. μ-law.  It has a preci‐
927                     sion  equivalent  to  roughly 14-bit PCM and is sometimes
928                     encoded with reversed bit-ordering (see the -X option).
929
930              oki-adpcm
931                     OKI (a.k.a. VOX, Dialogic, or Intel) 4-bit ADPCM; it  has
932                     a precision equivalent to roughly 12-bit PCM.  ADPCM is a
933                     form of audio compression  that  has  a  good  compromise
934                     between audio quality and encoding/decoding speed.
935
936              ima-adpcm
937                     IMA  (a.k.a. DVI) 4-bit ADPCM; it has a precision equiva‐
938                     lent to roughly 13-bit PCM.
939
940              ms-adpcm
941                     Microsoft 4-bit ADPCM; it has a precision  equivalent  to
942                     roughly 14-bit PCM.
943
944              gsm-full-rate
945                     GSM  is  currently  used  for  the  vast  majority of the
946                     world's digital wireless telephone  calls.   It  utilises
947                     several  audio formats with different bit-rates and asso‐
948                     ciated speech quality.  SoX has support for GSM's  origi‐
949                     nal  13kbps `Full Rate' audio format.  It is usually CPU-
950                     intensive to work with GSM audio.
951
952              Encoding names can  be  abbreviated  where  this  would  not  be
953              ambiguous; e.g. `unsigned-integer' can be given as `un', but not
954              `u' (ambiguous with `u-law').
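              The abbreviation rule can be illustrated with a short prefix
              matcher over the encoding names listed above.  This is a sketch
              of the rule as documented, not SoX's actual lookup code; the
              names ENCODINGS and resolve_encoding are hypothetical.

```shell
#!/bin/sh
# Encoding names as listed in this section.
ENCODINGS="signed-integer unsigned-integer floating-point a-law u-law mu-law oki-adpcm ima-adpcm ms-adpcm gsm-full-rate"

# Print the full encoding name for an abbreviation, or fail when the
# abbreviation matches no name or more than one name.
resolve_encoding() {
    match=""
    count=0
    for name in $ENCODINGS; do
        case "$name" in
            "$1"*) match=$name; count=$((count + 1)) ;;
        esac
    done
    # Succeed only when exactly one name starts with the abbreviation.
    [ "$count" -eq 1 ] && echo "$match"
}
```

              Here `resolve_encoding un' prints unsigned-integer, whereas
              `resolve_encoding u' fails because both unsigned-integer and
              u-law begin with `u'.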
955
956              For an input file, the most common use for  this  option  is  to
957              inform  SoX of the encoding of a `raw' (`headerless') audio file
958              (see the examples in -b and -c above).
959
960              For an output file, this option can be used (perhaps along  with
961              -b) to set the output encoding type.  For example
962                 sox input.cdda -e float output1.wav
963
964                 sox input.cdda -b 64 -e float output2.wav
965              convert  raw CD digital audio (16-bit, signed-integer) to float‐
966              ing-point `WAV' files (single & double precision respectively).
967
968              By default (i.e. if this option is not given), the output encod‐
969              ing  type  will  (providing  it  is supported by the output file
970              type) be set to the input encoding type.
971
972       --no-glob
973              Specifies that filename `globbing' (wild-card  matching)  should
974              not be performed by SoX on the following filename.  For example,
975              if the current  directory  contains  the  two  files  `five-sec‐
976              onds.wav' and `five*.wav', then
977                 play --no-glob "five*.wav"
978              can be used to play just the single file `five*.wav'.
979
980       -r, --rate RATE[k]
981              Gives the sample rate in Hz (or kHz if appended with `k') of the
982              file.
983
984              For an input file, the most common use for  this  option  is  to
985              inform  SoX  of  the sample rate of a `raw' (`headerless') audio
986              file (see the examples in -b and -c above).  Occasionally it may
987              be useful to use this option with a `headered' file, in order to
988              override the (presumably incorrect) value in the header  -  note
989              that  this is only supported with certain file types.  For exam‐
990              ple, if audio was recorded with a sample-rate of say 48k from  a
991              source that played back a little, say 1.5%, too slowly, then
992                 sox -r 48720 input.wav output.wav
993              effectively  corrects the speed by changing only the file header
994              (but see also the speed effect for the more  usual  solution  to
995              this problem).
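              The rate used in this example follows from scaling the recorded
              rate by the speed error: 48000 x 1.015 = 48720.  A sketch of
              that arithmetic in integer-only shell, with the hypothetical
              helper name corrected_rate and the error given in tenths of a
              percent:

```shell
#!/bin/sh
# Header rate needed to speed audio up by a given amount.
# $1 = recorded sample rate in Hz
# $2 = speed error in tenths of a percent (15 means 1.5%)
corrected_rate() {
    echo $(( $1 * (1000 + $2) / 1000 ))
}
```

              `corrected_rate 48000 15' prints 48720, the value passed to -r
              above.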
996
997              For  an output file, this option provides a shorthand for speci‐
998              fying that the rate effect should be invoked in order to  change
999              (if  necessary) the sample rate of the audio signal to the given
1000              value.  For example, the following two commands are equivalent:
1001                 sox input.wav -r 48k output.wav bass -b 24
1002                 sox input.wav        output.wav bass -b 24 rate 48k
1003              though the second form  is  more  flexible  as  it  allows  rate
1004              options  to be given, and allows the effects to be ordered arbi‐
1005              trarily.
1006
1007       -t, --type FILE-TYPE
1008              Gives the type of the audio file.  For  both  input  and  output
1009              files, this option is commonly used to inform SoX of the type of
1010              a `headerless' audio file (e.g. raw, mp3) where the
1011              actual/desired type cannot be determined from a given filename
1012              extension.  For example:
1013                 another-command | sox -t mp3 - output.wav
1014
1015                 sox input.wav -t raw output.bin
1016              It can also be used to override the type  implied  by  an  input
1017              filename  extension,  but  if  overriding with a type that has a
1018              header, SoX will exit with an appropriate error message if  such
1019              a header is not actually present.
1020
1021              See soxformat(7) for a list of supported file types.
1022
1023       -L, --endian little
1024       -B, --endian big
1025       -x, --endian swap
1026              These  options  specify whether the byte-order of the audio data
1027              is, respectively, `little endian', `big endian', or the opposite
1028              to  that  of  the system on which SoX is being used.  Endianness
1029              applies only to data encoded as floating-point, or as signed  or
1030              unsigned  integers of 16 or more bits.  It is often necessary to
1031              specify one of these options for headerless files, and sometimes
1032              necessary   for  (otherwise)  self-describing  files.   A  given
1033              endian-setting option may be ignored for  an  input  file  whose
1034              header contains a specific endianness identifier, or for an out‐
1035              put file that is actually an audio device.
1036
1037              N.B.  Unlike other format characteristics, the endianness (byte,
1038              nibble,  &  bit ordering) of the input file is not automatically
1039              used for the output file; so, for example, when the following is
1040              run on a little-endian system:
1041                 sox -B audio.s16 trimmed.s16 trim 2
1042              trimmed.s16 will be created as little-endian;
1043                 sox -B audio.s16 -B trimmed.s16 trim 2
1044              must be used to preserve big-endianness in the output file.
1045
1046              The -V option can be used to check the selected orderings.
1047
1048       -N, --reverse-nibbles
1049              Specifies that the nibble ordering (i.e. the 2 halves of a byte)
1050              of the samples should be reversed; sometimes useful with  ADPCM-
1051              based formats.
1052
1053              N.B.  See also N.B. in section on -x above.
1054
1055       -X, --reverse-bits
1056              Specifies  that  the  bit  ordering  of  the  samples  should be
1057              reversed; sometimes useful with a few (mostly  headerless)  for‐
1058              mats.
1059
1060              N.B.  See also N.B. in section on -x above.
1061
1062   Output File Format Options
1063       These  options  apply  only to the output file and may precede only the
1064       output filename on the command line.
1065
1066       --add-comment TEXT
1067              Append a comment in the output file header (where applicable).
1068
1069       --comment TEXT
1070              Specify the comment text to store  in  the  output  file  header
1071              (where applicable).
1072
1073              SoX  will  provide  a  default comment if this option (or --com‐
1074              ment-file) is not given. To specify that no  comment  should  be
1075              stored in the output file, use --comment "" .
1076
1077       --comment-file FILENAME
1078              Specify  a file containing the comment text to store in the out‐
1079              put file header (where applicable).
1080
1081       -C, --compression FACTOR
1082              The compression factor for variably compressing output file for‐
1083              mats.   If  this  option is not given then a default compression
1084              factor will apply.  The compression factor is  interpreted  dif‐
1085              ferently  for  different  compressing  file  formats.   See  the
1086              description of the file formats that use this option in  soxfor‐
1087              mat(7) for more information.
1088

EFFECTS

1090       In  addition  to converting, playing and recording audio files, SoX can
1091       be used to invoke a number of audio `effects'.  Multiple effects may be
1092       applied by specifying them one after another at the end of the SoX com‐
1093       mand line, forming an `effects chain'.   Note  that  applying  multiple
1094       effects  in  real-time (i.e. when playing audio) is likely to require a
1095       high performance computer. Stopping other  applications  may  alleviate
1096       performance issues should they occur.
1097
1098       Some  of the SoX effects are primarily intended to be applied to a sin‐
1099       gle instrument or `voice'.  To facilitate this, the  remix  effect  and
1100       the  global  SoX option -M can be used to isolate then recombine tracks
1101       from a multi-track recording.
1102
1103   Multiple Effects Chains
1104       A single effects chain is made up of one or more effects.   Audio  from
1105       the input runs through the chain until either the end of the input file
1106       is reached or an effect in the chain requests to terminate the chain.
1107
1108       SoX supports running multiple effects chains over the input audio.   In
1109       this  case,  when  one chain indicates it is done processing audio, the
1110       audio data is then sent through the next effects chain.  This continues
1111       until  either no more effects chains exist or the input has reached the
1112       end of the file.
1113
1114       An effects chain is terminated by placing a : (colon) after an  effect.
1115       Any following effects are a part of a new effects chain.
1116
1117       It  is  important  to  place the effect that will stop the chain as the
1118       first effect in the chain.   This  is  because  any  samples  that  are
1119       buffered  by effects to the left of the terminating effect will be dis‐
1120       carded.  The number of samples discarded is related  to  the  --buffer
1121       option, whose value should be kept small, relative to the sample rate,
1122       if the terminating effect cannot be first.  Further information on
1123       stopping effects can be found in the Stopping SoX section.
1124
1125       There  are a few pseudo-effects that aid using multiple effects chains.
1126       These include newfile which will start writing to  a  new  output  file
1127       before  moving  to  the  next effects chain and restart which will move
1128       back to the first effects chain.  Pseudo-effects must be  specified  as
1129       the  first  effect  in  a chain and as the only effect in a chain (they
1130       must have a : before and after they are specified).
1131
1132       The following is an example of multiple effects chains.  It will  split
1133       the  input file into multiple files of 30 seconds in length.  Each out‐
1134       put filename will have a unique number in its name as documented in the
1135       Output Files section.
1136          sox infile.wav output.wav trim 0 30 : newfile : restart
1137
1138   Common Notation And Parameters
1139       In the descriptions that follow, brackets [ ] are used to denote param‐
1140       eters that are optional, braces { }  to  denote  those  that  are  both
1141       optional  and  repeatable,  and angle brackets < > to denote those that
1142       are repeatable but not optional.  Where applicable, default values  for
1143       optional parameters are shown in parentheses ( ).
1144
1145       The  following parameters are used with, and have the same meaning for,
1146       several effects:
1147
1148       center[k]
1149              See frequency.
1150
1151       frequency[k]
1152              A frequency in Hz, or, if appended with `k', kHz.
1153
1154       gain   A power gain in dB.  Zero gives no gain; less than zero gives an
1155              attenuation.
1156
1157       position
1158              A  position  within the audio stream; the syntax is [=|+|-]time‐
1159              spec, where timespec is a time specification (see  below).   The
1160              optional first character indicates whether the timespec is to be
1161              interpreted relative to the start (=) or end (-) of audio, or to
1162              the  previous  position  if the effect accepts multiple position
1163              arguments (+).  The audio length must be known for  end-relative
1164              locations  to  work; some effects do accept -0 for end-of-audio,
1165              though, even if the length is unknown.  Which of =, +, - is  the
1166              default  depends  on  the  effect and is shown in its syntax as,
1167              e.g., position(+).
1168
1169              Examples: =2:00 (two minutes into the audio stream), -100s  (one
1170              hundred samples before the end of audio), +0:12+10s (twelve sec‐
1171              onds and ten samples after the previous position), -0.5+1s  (one
1172              sample less than half a second before the end of audio).
1173
1174       width[h|k|o|q]
1175              Used to specify the band-width of a filter.  A number of differ‐
1176              ent methods to specify the width are available (though  not  all
1177              for  every effect).  One of the characters shown may be appended
1178              to select the desired method as follows:
1179
1180                                        Method    Notes
1181                                   h      Hz
1182                                   k     kHz
1183                                   o   Octaves
1184                                   q   Q-factor   See [2]
1185
1186              For each effect that uses this  parameter,  the  default  method
1187              (i.e. if no character is appended) is the one that is listed
1188              first in the first line of the effect's description.
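
           The gain and width notations above can be related to linear
           quantities with a little arithmetic.  The following is a minimal
           sketch in Python, not part of SoX: the function names are
           hypothetical, and the octave-to-Q relation is the usual analogue
           approximation, which may differ slightly from the conversion
           that a given effect performs internally.

```python
import math

def db_to_amplitude_ratio(gain_db):
    """Convert a gain in dB to a linear amplitude ratio."""
    return 10.0 ** (gain_db / 20.0)

def db_to_power_ratio(gain_db):
    """Convert a gain in dB to a linear power ratio."""
    return 10.0 ** (gain_db / 10.0)

def q_from_hz(center_hz, width_hz):
    """Q-factor of a band width_hz wide centred on center_hz."""
    return center_hz / width_hz

def q_from_octaves(width_octaves):
    """Q-factor of a band width_octaves wide (analogue approximation)."""
    ratio = 2.0 ** width_octaves
    return math.sqrt(ratio) / (ratio - 1.0)

print(db_to_amplitude_ratio(-20.0))   # attenuation: ratio below 1
print(q_from_hz(1000.0, 100.0))       # a width of `100h' at 1 kHz
print(q_from_octaves(1.0))            # a width of `1o'
```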
1189
1190       Most effects that expect an audio position or duration in a  parameter,
1191       i.e. a time specification, accept either of the following two forms:
1192
1193       [[hours:]minutes:]seconds[.frac][t]
1194              A  specification  of  `1:30.5' corresponds to one minute, thirty
1195              and ½ seconds.  The t suffix is entirely optional (however,  see
1196              the  silence  effect for an exception).  Note that the component
1197              values do not have to be normalized; e.g.,  `1:23:45',  `83:45',
1198              `79:0285',  `1:0:1425',  `1::1425'  and `5025' all are legal and
1199              equivalent to each other.
1200
1201       sampless
1202              Specifies the number of samples directly, as  in  `8000s'.   For
1203              large  sample  counts,  e notation is supported: `1.7e6s' is the
1204              same as `1700000s'.
1205
1206       Time specifications can also be chained with + or -  into  a  new  time
1207       specification  where  the right part is added to or subtracted from the
1208       left, respectively: `3:00-200s' means two  hundred  samples  less  than
1209       three minutes.
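
           The two forms and the chaining rule can be illustrated by
           converting a time specification to a sample count.  The
           following Python sketch is an illustration only: the function
           names are hypothetical, the sample rate must be supplied by the
           caller, and SoX's own parser may round and report errors
           differently.

```python
import re

def _term_to_samples(term, rate):
    """Convert one unsigned term: `NNNs' or [[h:]m:]s[.frac][t]."""
    if term.endswith("s"):
        return round(float(term[:-1]))        # sample count, e notation allowed
    if term.endswith("t"):
        term = term[:-1]                      # the t suffix is optional
    seconds = 0.0
    for part in term.split(":"):              # empty components count as zero
        seconds = seconds * 60.0 + (float(part) if part else 0.0)
    return round(seconds * rate)

def timespec_to_samples(spec, rate):
    """Sum the +/- chained terms of a time specification, in samples."""
    total = 0
    # Split at + or -, except where the sign belongs to an exponent.
    for term in re.findall(r"[+-]?(?:[^+-]|(?<=[eE])[+-])+", spec):
        sign = -1 if term.startswith("-") else 1
        total += sign * _term_to_samples(term.lstrip("+-"), rate)
    return total

print(timespec_to_samples("1:30.5", 8000))
print(timespec_to_samples("3:00-200s", 44100))
```

           With this sketch, `1:23:45', `83:45', `79:0285', `1:0:1425',
           `1::1425' and `5025' all yield the same sample count at any
           given rate.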
1210
1211       To see if SoX has support for an optional effect, enter sox -h and look
1212       for its name under the list: `EFFECTS'.
1213
1214   Supported Effects
1215       Note: a categorised list of the effects can be found in the  accompany‐
1216       ing `README' file.
1217
1218       allpass frequency[k] width[h|k|o|q]
1219              Apply  a two-pole all-pass filter with central frequency (in Hz)
1220              frequency, and filter-width width.  An all-pass  filter  changes
1221              the audio's frequency to phase relationship without changing its
1222              frequency to amplitude relationship.  The filter is described in
1223              detail in [1].
1224
1225              This effect supports the --plot global option.
1226
1227       band [-n] center[k] [width[h|k|o|q]]
1228              Apply  a  band-pass  filter.  The frequency response drops loga‐
1229              rithmically around the center frequency.   The  width  parameter
1230              gives  the slope of the drop.  The frequencies at center + width
1231              and center - width will be half of  their  original  amplitudes.
1232              band  defaults  to a mode oriented to pitched audio, i.e. voice,
1233              singing, or instrumental music.  The -n (for noise) option  uses
1234              the  alternate  mode  for  un-pitched  audio  (e.g. percussion).
1235              Warning: -n introduces a power-gain of about 11dB in the filter,
1236              so  beware  of  output  clipping.   band introduces noise in the
1237              shape of the filter, i.e. peaking at the  center  frequency  and
1238              settling around it.
1239
1240              This effect supports the --plot global option.
1241
1242              See also sinc for a bandpass filter with steeper shoulders.
1243
1244       bandpass|bandreject [-c] frequency[k] width[h|k|o|q]
1245              Apply  a  two-pole  Butterworth  band-pass or band-reject filter
1246              with central frequency  frequency,  and  (3dB-point)  band-width
1247              width.   The  -c  option  applies only to bandpass and selects a
1248              constant skirt gain (peak gain = Q) instead of the default: con‐
1249              stant  0dB  peak  gain.   The filters roll off at 6dB per octave
1250              (20dB per decade) and are described in detail in [1].
1251
1252              These effects support the --plot global option.
1253
1254              See also sinc for a bandpass filter with steeper shoulders.
1255
1256       bandreject frequency[k] width[h|k|o|q]
1257              Apply a band-reject filter.  See the description of the bandpass
1258              effect for details.
1259
1260       bass|treble gain [frequency[k] [width[s|h|k|o|q]]]
1261              Boost  or  cut the bass (lower) or treble (upper) frequencies of
1262              the audio using a two-pole shelving filter with a response simi‐
1263              lar  to  that of a standard hi-fi's tone-controls.  This is also
1264              known as shelving equalisation (EQ).
1265
1266              gain gives the gain at 0 Hz (for  bass),  or  whichever  is  the
1267              lower  of  ∼22 kHz  and the Nyquist frequency (for treble).  Its
1268              useful range is about -20 (for a large cut) to +20 (for a  large
1269              boost).  Beware of Clipping when using a positive gain.
1270
1271              If  desired,  the  filter  can be fine-tuned using the following
1272              optional parameters:
1273
1274              frequency sets the filter's central frequency and so can be used
1275              to  extend  or  reduce the frequency range to be boosted or cut.
1276              The default value is 100 Hz (for bass) or 3 kHz (for treble).
1277
1278              width determines how steep the filter's shelf transition is.  In
1279              addition  to  the  common  width specification methods described
1280              above, `slope' (the default, or if appended  with  `s')  may  be
1281              used.   The  useful  range of `slope' is about 0.3, for a gentle
1282              slope, to 1 (the maximum), for a steep slope; the default  value
1283              is 0.5.
1284
1285              The filters are described in detail in [1].
1286
1287              These effects support the --plot global option.
1288
1289              See also equalizer for a peaking equalisation effect.
1290
1291       bend [-f frame-rate(25)] [-o over-sample(16)]
1292              { start-position(+),cents,end-position(+) }
1293              Changes pitch by specified amounts  at  specified  times.   Each
1294              given  triple:  start-position,cents,end-position  specifies one
1295              bend.  cents is the number of cents (100 cents = 1 semitone)  by
1296              which  to bend the pitch. The other values specify the points in
1297              time at which to start and end bending the pitch, respectively.
1298
1299              The pitch-bending algorithm utilises the Discrete Fourier Trans‐
1300              form  (DFT)  at  a particular frame rate and over-sampling rate.
1301              The -f and -o parameters may be used to adjust these  parameters
1302              and thus control the smoothness of the changes in pitch.
1303
1304              For  example,  an  initial  tone  is  generated, then bent three
1305              times, yielding four different notes in total:
1306                 play -n synth 2.5 sin 667 gain 1 \
1307                   bend .35,180,.25  .15,740,.53  0,-520,.3
1308              Here, the first bend runs from 0.35 to 0.6, and the  second  one
1309              from  0.75 to 1.28 seconds.  Note that the clipping that is pro‐
1310              duced in this example is deliberate; to remove it,  use  gain -5
1311              in place of gain 1.
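
                  The timing of the bends in this example follows from the
                  relative (+) positions, and the size of each bend from
                  the definition of a cent.  The Python sketch below is an
                  illustration only: bend_schedule is a hypothetical
                  helper, positions are taken as plain seconds, and the
                  ratio shown is that of each bend on its own.

```python
def bend_schedule(bends):
    """bends: list of (start, cents, end) with relative (+) positions.

    Returns a list of (start_seconds, end_seconds, frequency_ratio).
    """
    schedule = []
    clock = 0.0                       # end of the previous bend
    for start, cents, end in bends:
        begin = clock + start         # start is relative to the previous end
        finish = begin + end          # end is relative to this bend's start
        schedule.append((begin, finish, 2.0 ** (cents / 1200.0)))
        clock = finish
    return schedule

for begin, finish, ratio in bend_schedule(
        [(0.35, 180, 0.25), (0.15, 740, 0.53), (0.0, -520, 0.3)]):
    print(f"{begin:.2f}s to {finish:.2f}s, frequency x {ratio:.4f}")
```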
1312
1313              See also pitch.
1314
1315       biquad b0 b1 b2 a0 a1 a2
1316              Apply a biquad IIR filter with the given coefficients, where b*
1317              and a* are the numerator and denominator coefficients
1318              respectively.
1319
1320              See http://en.wikipedia.org/wiki/Digital_biquad_filter (where a0
1321              = 1).
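
                  A biquad filter computes each output sample from the
                  current and two previous input samples and the two
                  previous output samples.  The following Python sketch
                  shows the standard direct-form difference equation,
                  normalised by a0; it is an illustration of the filter
                  structure, not SoX's own code.

```python
def biquad(samples, b0, b1, b2, a0, a1, a2):
    """Apply y[n] = (b0 x[n] + b1 x[n-1] + b2 x[n-2]
                     - a1 y[n-1] - a2 y[n-2]) / a0 to a list of samples."""
    x1 = x2 = y1 = y2 = 0.0           # filter state starts at silence
    out = []
    for x0 in samples:
        y0 = (b0 * x0 + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2) / a0
        out.append(y0)
        x2, x1 = x1, x0               # shift the input history
        y2, y1 = y1, y0               # shift the output history
    return out

# Impulse response of a simple one-pole low-pass expressed as a biquad.
print(biquad([1.0, 0.0, 0.0, 0.0], 1.0, 0.0, 0.0, 1.0, -0.5, 0.0))
```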
1322
1323              This effect supports the --plot global option.
1324
1325       channels CHANNELS
1326              Invoke a simple algorithm to change the number  of  channels  in
1327              the  audio  signal  to  the  given  number  CHANNELS:  mixing if
1328              decreasing the number of channels or duplicating  if  increasing
1329              the number of channels.
1330
1331              The  channels effect is invoked automatically if SoX's -c option
1332              specifies a number of channels that is different to that of  the
1333              input  file(s).   Alternatively, if this effect is given explic‐
1334              itly, then SoX's -c option need not be given.  For example,  the
1335              following two commands are equivalent:
1336                 sox input.wav -c 1 output.wav bass -b 24
1337                 sox input.wav      output.wav bass -b 24 channels 1
1338              though the second form is more flexible as it allows the effects
1339              to be ordered arbitrarily.
1340
1341              See also  remix  for  an  effect  that  allows  channels  to  be
1342              mixed/selected arbitrarily.
1343
1344       chorus gain-in gain-out <delay decay speed depth -s|-t>
1345              Add  a chorus effect to the audio.  This can make a single vocal
1346              sound like a chorus, but can also be applied to instrumentation.
1347
1348              Chorus resembles an echo effect with a short delay, but  whereas
1349              with echo the delay is constant, with chorus, it is varied using
1350              sinusoidal or triangular modulation.  The modulation depth
1351              defines the range over which the modulated delay varies around
1352              the nominal delay.  Hence the delayed sound is played slightly
1353              slower or faster, that is, detuned around the original, as in
1354              a chorus where some voices are slightly off key.  See [3] for
1355              more discussion of the chorus effect.
1356
1357              Each  four-tuple  parameter  delay/decay/speed/depth  gives  the
1358              delay in milliseconds and the decay (relative to gain-in) with a
1359              modulation speed in Hz using depth in milliseconds.  The modula‐
1360              tion is either sinusoidal (-s) or triangular (-t).  Gain-out  is
1361              the volume of the output.
1362
1363              A  typical delay is around 40ms to 60ms; the modulation speed is
1364              best near 0.25Hz and the modulation depth around 2ms.  For exam‐
1365              ple, a single delay:
1366                 play guitar1.wav chorus 0.7 0.9 55 0.4 0.25 2 -t
1367              Two delays of the original samples:
1368                 play guitar1.wav chorus 0.6 0.9 50 0.4 0.25 2 -t \
1369                    60 0.32 0.4 1.3 -s
1370              A fuller sounding chorus (with three additional delays):
1371                 play guitar1.wav chorus 0.5 0.9 50 0.4 0.25 2 -t \
1372                    60 0.32 0.4 2.3 -t 40 0.3 0.3 1.3 -s
1373
1374       compand attack1,decay1{,attack2,decay2}
1375              [soft-knee-dB:]in-dB1[,out-dB1]{,in-dB2,out-dB2}
1376              [gain [initial-volume-dB [delay]]]
1377
1378              Compand (compress or expand) the dynamic range of the audio.
1379
1380              The  attack and decay parameters (in seconds) determine the time
1381              over which the instantaneous level of the input signal is  aver‐
1382              aged to determine its volume; attacks refer to increases in vol‐
1383              ume and decays refer to decreases.   For  most  situations,  the
1384              attack  time  (response  to  the music getting louder) should be
1385              shorter than the decay time because the human ear is more sensi‐
1386              tive  to  sudden  loud music than sudden soft music.  Where more
1387              than one pair of attack/decay  parameters  are  specified,  each
1388              input  channel  is  companded separately and the number of pairs
1389              must agree with the number of input  channels.   Typical  values
1390              are 0.3,0.8 seconds.
1391
1392              The  second  parameter  is  a  list of points on the compander's
1393              transfer function specified in dB relative to the maximum possi‐
1394              ble  signal  amplitude.   The input values must be in a strictly
1395              increasing order but the transfer function does not have  to  be
1396              monotonically rising.  If omitted, the value of out-dB1 defaults
1397              to the same value as in-dB1; levels below in-dB1  are  not  com‐
1398              panded  (but  may  have gain applied to them).  The point 0,0 is
1399              assumed but may be overridden (by 0,out-dBn).  If  the  list  is
1400              preceded by a soft-knee-dB value, then the points where adjacent
1401              line segments on the transfer function meet will be rounded
1402              by  the  amount given.  Typical values for the transfer function
1403              are 6:-70,-60,-20.
1404
1405              The third (optional) parameter is an additional gain in dB to be
1406              applied  at  all points on the transfer function and allows easy
1407              adjustment of the overall gain.
1408
1409              The fourth (optional)  parameter  is  an  initial  level  to  be
1410              assumed  for  each channel when companding starts.  This permits
1411              the user to supply a nominal level initially, so that, for exam‐
1412              ple,  a  very large gain is not applied to initial signal levels
1413              before the companding action has begun to operate: it  is  quite
1414              probable  that  in  such  an event, the output would be severely
1415              clipped while the compander gain  properly  adjusts  itself.   A
1416              typical value (for audio which is initially quiet) is -90 dB.
1417
1418              The fifth (optional) parameter is a delay in seconds.  The input
1419              signal is analysed immediately to control the compander, but  it
1420              is  delayed before being fed to the volume adjuster.  Specifying
1421              a delay approximately equal to the attack/decay times allows the
1422              compander to effectively operate in a `predictive' rather than a
1423              reactive mode.  A typical value is 0.2 seconds.
1424
1425                                    *        *        *
1426
1427              The following example might be used to make  a  piece  of  music
1428              with both quiet and loud passages suitable for listening to in a
1429              noisy environment such as a moving vehicle:
1430                 sox asz.wav asz-car.wav compand 0.3,1 6:-70,-60,-20 -5 -90 0.2
1431              The transfer function (`6:-70,...') says that very  soft  sounds
1432              (below -70dB) will remain unchanged.  This will stop the compan‐
1433              der from boosting  the  volume  on  `silent'  passages  such  as
1434              between  movements.   However,  sounds in the range -60dB to 0dB
1435              (maximum volume) will be boosted so that the 60dB dynamic  range
1436              of  the  original  music  will  be compressed 3-to-1 into a 20dB
1437              range, which is wide enough to enjoy the music but narrow enough
1438              to  get  around  the road noise.  The `6:' selects 6dB soft-knee
1439              companding.  The -5 (dB) output gain is needed to avoid clipping
1440              (the  number  is  inexact,  and was derived by experimentation).
1441              The -90 (dB) for the initial volume will work fine  for  a  clip
1442              that  starts  with  near silence, and the delay of 0.2 (seconds)
1443              has the effect of causing the compander  to  react  a  bit  more
1444              quickly to sudden volume changes.
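
                  The static part of this example, i.e. the mapping from
                  input level to output level, can be worked through by
                  linear interpolation between the transfer function
                  points.  The Python sketch below is an illustration
                  only: the function names are hypothetical, the soft-knee
                  rounding and the attack/decay averaging are ignored, only
                  finite points are handled, and levels below the first
                  point are assumed to keep that point's offset.

```python
def parse_points(spec):
    """Parse `[soft-knee-dB:]in-dB1[,out-dB1]{,in-dB2,out-dB2}'."""
    spec = spec.rpartition(":")[2]            # drop any soft-knee prefix
    values = [float(v) for v in spec.split(",")]
    if len(values) % 2 == 1:                  # out-dB1 omitted:
        values.insert(1, values[0])           # it defaults to in-dB1
    points = list(zip(values[0::2], values[1::2]))
    if points[-1][0] < 0.0:                   # the point 0,0 is assumed
        points.append((0.0, 0.0))
    return points

def transfer(in_db, points, gain_db=0.0):
    """Output level in dB for a steady input level in dB."""
    first_in, first_out = points[0]
    if in_db <= first_in:
        out_db = in_db + (first_out - first_in)   # assumption, see above
    else:
        out_db = points[-1][1]                    # clamp above the last point
        for (x0, y0), (x1, y1) in zip(points, points[1:]):
            if in_db <= x1:
                out_db = y0 + (y1 - y0) * (in_db - x0) / (x1 - x0)
                break
    return out_db + gain_db

points = parse_points("6:-70,-60,-20")
for level in (-80.0, -60.0, -30.0, 0.0):
    print(level, "->", transfer(level, points, gain_db=-5.0))
```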
1445
1446              In  the  next example, compand is being used as a noise-gate for
1447              when the noise is at a lower level than the signal:
1448                 play infile compand .1,.2 -inf,-50.1,-inf,-50,-50 0 -90 .1
1449              Here is another noise-gate, this time for when the noise is at a
1450              higher  level  than the signal (making it, in some ways, similar
1451              to squelch):
1452                 play infile compand .1,.1 -45.1,-45,-inf,0,-inf 45 -90 .1
1453              This effect supports the --plot global option (for the  transfer
1454              function).
1455
1456              See also mcompand for a multiple-band companding effect.
1457
1458       contrast [enhancement-amount(75)]
1459              Comparable  with compression, this effect modifies an audio sig‐
1460              nal to make it sound louder.   enhancement-amount  controls  the
1461              amount  of  the  enhancement and is a number in the range 0-100.
1462              Note that enhancement-amount = 0 still gives a significant  con‐
1463              trast enhancement.
1464
1465              See also the compand and mcompand effects.
1466
1467       dcshift shift [limitergain]
1468              Apply  a  DC shift to the audio.  This can be useful to remove a
1469              DC offset (caused perhaps by a hardware problem in the recording
1470              chain)  from  the  audio.   The effect of a DC offset is reduced
1471              headroom and hence volume.  The stat or stats effect can be used
1472              to determine if a signal has a DC offset.
1473
1474              The  given dcshift value is a floating point number in the range
1475              of ±2 that indicates the amount to shift the audio (which is  in
1476              the range of ±1).
1477
1478              An  optional  limitergain  can  be specified as well.  It should
1479              have a value much less than 1 (e.g. 0.05 or 0.02)  and  is  used
1480              only on peaks to prevent clipping.
1481
1482                                    *        *        *
1483
1484              An  alternative  approach to removing a DC offset (albeit with a
1485              short delay) is to use the highpass filter effect at a frequency
1486              of say 10Hz, as illustrated in the following example:
1487                 sox -n dc.wav synth 5 sin %0 50
1488                 sox dc.wav fixed.wav highpass 10
1489
1490       deemph Apply Compact Disc (IEC 60908) de-emphasis (a treble attenuation
1491              shelving filter).
1492
1493              Pre-emphasis was applied in the mastering of some CDs issued  in
1494              the early 1980s.  These included many classical music albums, as
1495              well as now sought-after issues of albums by The  Beatles,  Pink
1496              Floyd  and  others.   Pre-emphasis should be removed at playback
1497              time by a de-emphasis filter in the playback  device.   However,
1498              not  all  modern CD players have this filter, and very few PC CD
1499              drives have it; playing pre-emphasised audio without the correct
1500              de-emphasis filter results in audio that sounds harsh and is far
1501              from what its creators intended.
1502
1503              With the deemph effect, it is possible to  apply  the  necessary
1504              de-emphasis  to  audio that has been extracted from a pre-empha‐
1505              sised CD, and then either burn the de-emphasised audio to a  new
1506              CD  (which will then play correctly on any CD player), or simply
1507              play the correctly de-emphasised audio files  on  the  PC.   For
1508              example:
1509                 sox track1.wav track1-deemph.wav deemph
1510              and then burn track1-deemph.wav to CD, or
1511                 play track1-deemph.wav
1512              or simply
1513                 play track1.wav deemph
1514              The  de-emphasis  filter is implemented as a biquad and requires
1515              the input audio sample rate to be either 44.1kHz or 48kHz.  Max‐
1516              imum  deviation  from  the  ideal response is only 0.06dB (up to
1517              20kHz).
1518
1519              This effect supports the --plot global option.
1520
1521              See also the bass and treble shelving equalisation effects.
1522
1523       delay {position(=)}
1524              Delay one or more audio channels such that  they  start  at  the
1525              given  position.   For  example,  delay  1.5 +1 3000s delays the
1526              first channel by 1.5 seconds, the second channel by 2.5  seconds
1527              (one  second  more than the previous channel), the third channel
1528              by 3000 samples, and leaves  any  other  channels  that  may  be
1529              present  un-delayed.   The  following (one long) command plays a
1530              chime sound:
1531                 play -n synth -j 3 sin %3 sin %-2 sin %-5 sin %-9 \
1532                   sin %-14 sin %-21 fade h .01 2 1.5 delay \
1533                   1.3 1 .76 .54 .27 remix - fade h 0 2.7 2.5 norm -1
1534              and this plays a guitar chord:
1535                 play -n synth pl G2 pl B2 pl D3 pl G3 pl D4 pl G4 \
1536                   delay 0 .05 .1 .15 .2 .25 remix - fade 0 4 .1 norm -1
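
                  The arithmetic behind delay 1.5 +1 3000s can be checked
                  with a short Python sketch.  It is an illustration only:
                  delay_offsets is a hypothetical helper that accepts just
                  plain seconds and sample counts, each optionally
                  prefixed with = or +, and the sample rate is supplied by
                  the caller.

```python
def delay_offsets(args, rate):
    """Return the per-channel delay in samples for the given arguments."""
    offsets = []
    previous = 0
    for arg in args:
        relative = arg.startswith("+")        # + means: after the previous one
        body = arg.lstrip("=+")               # = (from the start) is the default
        if body.endswith("s"):
            amount = int(body[:-1])           # sample count
        else:
            amount = round(float(body) * rate)  # seconds
        previous = previous + amount if relative else amount
        offsets.append(previous)
    return offsets

print(delay_offsets(["1.5", "+1", "3000s"], 8000))
```

                  Channels beyond the last argument are not listed and
                  remain un-delayed.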
1537
1538       dither [-S|-s|-f filter] [-a] [-p precision]
1539              Apply dithering to the audio.   Dithering  deliberately  adds  a
1540              small  amount  of  noise  to the signal in order to mask audible
1541              quantization effects that can occur if the output sample size is
1542              less than 24 bits.  With no options, this effect will add trian‐
1543              gular (TPDF) white noise.  Noise-shaping (only for certain  sam‐
1544              ple  rates)  can be selected with -s.  With the -f option, it is
1545              possible to select a particular noise-shaping  filter  from  the
1546              following   list:   lipshitz,  f-weighted,  modified-e-weighted,
1547              improved-e-weighted, gesemann, shibata,  low-shibata,  high-shi‐
1548              bata.   Note  that  most  filter  types  are available only with
1549              44100Hz sample rate.  The filter types are distinguished by  the
1550              following  properties: audibility of noise, level of (inaudible,
1551              but in some circumstances, otherwise  problematic)  shaped  high
1552              frequency noise, and processing speed.
1553              See  http://sox.sourceforge.net/SoX/NoiseShaping  for  graphs of
1554              the different noise-shaping curves.
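
                  The idea behind plain TPDF dithering can be sketched in
                  a few lines of Python.  This is a generic textbook
                  illustration, not SoX's implementation: the function
                  name and the seed parameter are hypothetical, samples
                  are assumed to lie in the range ±1, and no noise-shaping
                  is applied.

```python
import random

def tpdf_dither_quantize(samples, bits, seed=0):
    """Quantize samples in the range ±1 to the given number of bits,
    adding triangular (TPDF) white noise before rounding."""
    rng = random.Random(seed)
    lsb = 2.0 / (2 ** bits)           # quantization step for a ±1 range
    out = []
    for x in samples:
        # The difference of two uniform values has a triangular
        # distribution spanning ±1 quantization step.
        noise = (rng.random() - rng.random()) * lsb
        out.append(round((x + noise) / lsb) * lsb)
    return out

print(tpdf_dither_quantize([0.0, 0.1, -0.25, 0.5], bits=8))
```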
1555
1556              The -S option selects a slightly `sloped' TPDF,  biased  towards
1557              higher  frequencies.   It  can  be used at any sampling rate but
1558              below ≈22k, plain TPDF is probably  better,  and  above  ≈  37k,
1559              noise-shaping (if available) is probably better.
1560
1561              The  -a option enables a mode where dithering (and noise-shaping
1562              if applicable) are automatically enabled only when needed.   The
1563              most  likely  use for this is when applying fade in or out to an
1564              already dithered file, so that the redithering applies  only  to
1565              the  faded portions.  However, auto dithering is not fool-proof,
1566              so the fades should be carefully checked for any  noise  modula‐
1567              tion;  if  this occurs, then either re-dither the whole file, or
1568              use trim, fade, and concatenate.
1569
1570              The -p option allows overriding the target precision.
1571
1572              If the SoX global option -R is not given, then the
1573              pseudo-random  number generator used to generate the white noise
1574              will be `reseeded', i.e. the generated noise will  be  different
1575              between invocations.
1576
1577              This  effect  should  not  be  followed by any other effect that
1578              affects the audio.
1579
1580              See also the `Dithering' section above.
1581
1582       downsample [factor(2)]
1583              Downsample the signal by an integer factor: Only the  first  out
1584              of each factor samples is retained, the others are discarded.
1585
1586              No decimation filter is applied.  If the input is not a properly
1587              bandlimited baseband signal, aliasing will occur.  This  may  be
1588              desirable, e.g., for frequency translation.
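
                  Both the sample selection and the aliasing can be
                  demonstrated with a short Python sketch.  It is an
                  illustration only (the helper names are hypothetical):
                  a 3 kHz tone sampled at 8 kHz and downsampled by 2
                  becomes indistinguishable from an inverted 1 kHz tone
                  sampled at 4 kHz.

```python
import math

def downsample(samples, factor=2):
    """Keep the first out of each `factor' samples; no filtering."""
    return samples[::factor]

def sine(frequency, rate, count):
    """Generate `count' samples of a sine wave."""
    return [math.sin(2.0 * math.pi * frequency * n / rate)
            for n in range(count)]

aliased = downsample(sine(3000.0, 8000.0, 64), 2)   # now at 4 kHz rate
reference = sine(1000.0, 4000.0, 32)
print(max(abs(a + r) for a, r in zip(aliased, reference)))
```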
1589
1590              For  a  general  resampling effect with anti-aliasing, see rate.
1591              See also upsample.
1592
1593       earwax Makes audio easier to listen to on headphones.  Adds  `cues'  to
1594              44.1kHz  stereo  (i.e.  audio CD format) audio so that when lis‐
1595              tened to on headphones the stereo image  is  moved  from  inside
1596              your  head  (standard for headphones) to outside and in front of
1597              the listener (standard for speakers).
1598
1599       echo gain-in gain-out <delay decay>
1600              Add echoing to the audio.  Echoes are reflected  sound  and  can
1601              occur  naturally  amongst  mountains (and sometimes large build‐
1602              ings) when talking or shouting;  digital  echo  effects  emulate
1603              this  behaviour and are often used to help fill out the sound of
1604              a single instrument or vocal.  The time difference  between  the
1605              original  signal  and  the reflection is the `delay' (time), and
1606              the loudness of the reflected signal is the  `decay'.   Multiple
1607              echoes can have different delays and decays.
1608
1609              Each  given delay decay pair gives the delay in milliseconds and
1610              the decay (relative to gain-in) of that echo.  Gain-out  is  the
1611              volume  of  the output.  For example: This will make it sound as
1612              if there are twice as many instruments as are actually playing:
1613                 play lead.aiff echo 0.8 0.88 60 0.4
1614              If the delay is very short, then it sounds like a (metallic)
1615              robot playing music:
1616                 play lead.aiff echo 0.8 0.88 6 0.4
1617              A  longer delay will sound like an open air concert in the moun‐
1618              tains:
1619                 play lead.aiff echo 0.8 0.9 1000 0.3
1620              One mountain more, and:
1621                 play lead.aiff echo 0.8 0.9 1000 0.3 1800 0.25
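
                  The way the parameters combine can be sketched in
                  Python.  This is a simplified illustration, not SoX's
                  code: it assumes that each echo is a copy of the
                  unscaled input, delayed and multiplied by its decay,
                  that the sum is then scaled by gain-out, and that the
                  output is no longer than the input (the echo tail is
                  dropped).  The function name is hypothetical.

```python
def echo(samples, gain_in, gain_out, taps, rate):
    """taps is a list of (delay_ms, decay) pairs; rate is in Hz."""
    out = []
    for n, x in enumerate(samples):
        y = gain_in * x                           # the direct signal
        for delay_ms, decay in taps:
            d = round(delay_ms * rate / 1000.0)   # delay in samples
            if n >= d:
                y += decay * samples[n - d]       # one delayed copy
        out.append(gain_out * y)
    return out

# An impulse with a single echo, two samples later at a 1 kHz rate.
print(echo([1.0, 0.0, 0.0, 0.0], 0.8, 0.5, [(2, 0.4)], 1000))
```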
1622
1623       echos gain-in gain-out <delay decay>
1624              Add a sequence of echoes to the audio.  Each  delay  decay  pair
1625              gives the delay in milliseconds and the decay (relative to gain-
1626              in) of that echo.  Gain-out is the volume of the output.
1627
1628              Like the echo effect, echos stands for `ECHO in Sequel', that is
1629              the  first  echos  takes the input, the second the input and the
1630              first echos, the third the input and the first  and  the  second
1631              echos,  ... and so on.  Care should be taken using many echos; a
1632              single echos has the same effect as a single echo.
1633
1634              The sample will be bounced twice in symmetric echos:
1635                 play lead.aiff echos 0.8 0.7 700 0.25 700 0.3
1636              The sample will be bounced twice in asymmetric echos:
1637                 play lead.aiff echos 0.8 0.7 700 0.25 900 0.3
1638              The sample will sound as if played in a garage:
1639                 play lead.aiff echos 0.8 0.7 40 0.25 63 0.3
1640
1641       equalizer frequency[k] width[q|o|h|k] gain
1642              Apply a two-pole peaking equalisation (EQ)  filter.   With  this
1643              filter,  the signal-level at and around a selected frequency can
1644              be increased or decreased, whilst (unlike  band-pass  and  band-
1645              reject filters) that at all other frequencies is unchanged.
1646
1647              frequency gives the filter's central frequency in Hz, width, the
1648              band-width, and gain the required gain  or  attenuation  in  dB.
1649              Beware of Clipping when using a positive gain.
1650
1651              In order to produce complex equalisation curves, this effect can
1652              be given several times, each with a different central frequency.
1653
1654              The filter is described in detail in [1].
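
                  For orientation, the following Python sketch computes
                  peaking-filter coefficients using the widely published
                  `Audio EQ Cookbook' formulae.  It assumes that these are
                  the formulae meant by [1] and that the width is given as
                  a Q-factor; the function name is hypothetical and SoX's
                  own coefficients may differ in detail.  The result is in
                  the b0 b1 b2 a0 a1 a2 order used by the biquad effect.

```python
import math

def peaking_eq_coefficients(rate, frequency, q, gain_db):
    """Return (b0, b1, b2, a0, a1, a2) for a peaking EQ biquad."""
    amp = 10.0 ** (gain_db / 40.0)            # square root of the linear gain
    w0 = 2.0 * math.pi * frequency / rate     # centre frequency in radians
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    return (1.0 + alpha * amp, -2.0 * cos_w0, 1.0 - alpha * amp,
            1.0 + alpha / amp, -2.0 * cos_w0, 1.0 - alpha / amp)

print(peaking_eq_coefficients(44100.0, 1000.0, 1.0, 6.0))
```

                  With a gain of 0 dB the numerator and denominator are
                  identical, i.e. the filter leaves the audio unchanged.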
1655
1656              This effect supports the --plot global option.
1657
1658              See also bass and treble for shelving equalisation effects.
1659
1660       fade [type] fade-in-length [stop-position(=) [fade-out-length]]
1661              Apply a fade effect to the beginning, end, or both of the audio.
1662
1663              An optional type can be specified to select  the  shape  of  the
1664              fade  curve:  q  for  quarter  of a sine wave, h for half a sine
1665              wave, t for linear (`triangular') slope, l for logarithmic,  and
1666              p for inverted parabola.  The default is logarithmic.
1667
1668              A  fade-in  starts  from  the  first sample and ramps the signal
1669              level from 0 to full volume over  the  time  given  as  fade-in-
1670              length.  Specify 0 if no fade-in is wanted.
1671
1672              For  fade-outs, the audio will be truncated at stop-position and
1673              the signal level will be ramped from full volume down to 0  over
1674              an  interval  of  fade-out-length  before the stop-position.  If
1675              fade-out-length is not specified, it defaults to the same  value
1676              as fade-in-length.  No fade-out is performed if stop-position is
1677              not specified.  If the audio length can be determined  from  the
1678              input  file  header  and  any previous effects, then -0 (or, for
1679              historical reasons, 0) may be  specified  for  stop-position  to
1680              indicate  the  usual  case of a fade-out that ends at the end of
1681              the input audio stream.
1682
1683              Any time specification may be used for fade-in-length and  fade-
1684              out-length.
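
                  The fade shapes can be compared numerically with a small
                  Python sketch.  The formulae below are the usual
                  textbook forms of the linear, quarter-sine, half-sine
                  and inverted-parabola curves and are an assumption, not
                  a transcription of SoX's code; the logarithmic curve is
                  omitted because its exact form is specific to SoX.  The
                  names are hypothetical and the fade length is given in
                  samples.

```python
import math

# Gain as a function of progress p through the fade-in, 0 <= p <= 1.
FADE_IN_CURVES = {
    "t": lambda p: p,                                   # linear
    "q": lambda p: math.sin(p * math.pi / 2.0),         # quarter sine wave
    "h": lambda p: (1.0 - math.cos(p * math.pi)) / 2.0, # half sine wave
    "p": lambda p: 1.0 - (1.0 - p) ** 2,                # inverted parabola
}

def fade_in(samples, length, curve="t"):
    """Ramp the first `length' samples from 0 up to full volume."""
    shape = FADE_IN_CURVES[curve]
    return [x * shape(n / length) if n < length else x
            for n, x in enumerate(samples)]

for name in "tqhp":
    print(name, [round(v, 3) for v in fade_in([1.0] * 5, 4, name)])
```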
1685
1686              See also the splice effect.
1687
1688       fir [coefs-file|coefs]
1689              Use  SoX's  FFT convolution engine with given FIR filter coeffi‐
1690              cients.  If a single argument is given then this is  treated  as
1691              the  name  of  a file containing the filter coefficients (white-
1692              space separated; may contain `#' comments).  If the given  file‐
1693              name  is  `-', or if no argument is given, then the coefficients
1694              are read from the `standard input' (stdin);  otherwise,  coeffi‐
1695              cients may be given on the command line.  Examples:
1696                 sox infile outfile fir 0.0195 -0.082 0.234 0.891 -0.145 0.043
1697                 sox infile outfile fir coefs.txt
1698              with coefs.txt containing
1699                 # HP filter
1700                 # freq=10000
1701                   1.2311233052619888e-01
1702                  -4.4777096106211783e-01
1703                   5.1031563346705155e-01
1704                  -6.6502926320995331e-02
1705                 ...
1706
1707              This effect supports the --plot global option.
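
                  A coefficients file can also be generated rather than written
                  by hand.  The sketch below emits an N-tap moving-average
                  (low-pass) filter in the format described above; the helper
                  name boxcar_coefs is hypothetical and the filter is chosen
                  only because its coefficients are trivial to compute.

```shell
#!/bin/sh
# boxcar_coefs N
# Write an N-tap moving-average FIR filter: a `#' comment line followed by
# N equal coefficients of 1/N, one per line.  Hypothetical helper.
boxcar_coefs() {
    awk -v n="$1" 'BEGIN {
        printf "# %d-tap moving average\n", n
        for (i = 0; i < n; i++) printf "%.16e\n", 1 / n
    }'
}
```

                  Because fir reads coefficients from stdin when the file name
                  is `-', the output can be piped straight in, e.g.
                  `boxcar_coefs 8 | sox infile outfile fir -'.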
1708
1709       flanger [delay depth regen width speed shape phase interp]
1710              Apply  a  flanging  effect to the audio.  See [3] for a detailed
1711              description of flanging.
1712
1713              All parameters are optional (right to left).
                  Parameters are positional, so only trailing ones can be left
                  out.
1714
1715                        Range     Default   Description
1716              delay     0 - 30       0      Base delay in milliseconds.
1717              depth     0 - 10       2      Added swept delay in milliseconds.
1718              regen    -95 - 95      0      Percentage regeneration (delayed
1719                                            signal feedback).
1720              width    0 - 100      71      Percentage of delayed signal mixed
1721                                            with original.
1722              speed    0.1 - 10     0.5     Sweeps per second (Hz).
1723              shape                 sin     Swept wave shape: sine|triangle.
1724              phase    0 - 100      25      Swept wave percentage phase-shift
1725                                            for multi-channel (e.g. stereo)
1726                                            flange; 0 = 100 = same phase on
1727                                            each channel.
1728              interp                lin     Digital delay-line interpolation:
1729                                            linear|quadratic.
1730
1731       gain [-e|-B|-b|-r] [-n] [-l|-h] [gain-dB]
1732              Apply amplification or attenuation to the audio signal,  or,  in
1733              some  cases,  to  some of its channels.  Note that use of any of
1734              -e, -B, -b, -r, or -n requires temporary file space to store the
1735              audio  to  be  processed,  so  may  be  unsuitable  for use with
1736              `streamed' audio.
1737
1738              Without other options, gain-dB is  used  to  adjust  the  signal
1739              power  level  by  the  given  number  of  dB: positive amplifies
1740              (beware of Clipping), negative attenuates.  With other  options,
1741              the  gain-dB amplification or attenuation is (logically) applied
1742              after the processing due to those options.
1743
1744              Given the -e option, the levels  of  the  audio  channels  of  a
1745              multi-channel file are `equalised', i.e.  gain is applied to all
1746              channels other than that with the highest peak level, such  that
1747              all  channels attain the same peak level (but, without also giv‐
1748              ing -n, the audio is not `normalised').
1749
1750              The -B (balance) option is similar to -e, but with -B,  the  RMS
1751              level  is  used  instead of the peak level.  -B might be used to
1752              correct stereo imbalance caused by an imperfect record turntable
1753              cartridge.   Note that unlike -e, -B might cause some clipping.
1754
1755              -b is similar to -B but has clipping protection, i.e.  if neces‐
1756              sary  to  prevent  clipping  whilst  balancing,  attenuation  is
1757              applied  to  all  channels.   Note, however, that in conjunction
1758              with -n, -B and -b are synonymous.
1759
1760              The -r option is used in conjunction with a prior invocation  of
1761              gain with the -h option - see below for details.
1762
1763              The  -n option normalises the audio to 0dB FSD; it is often used
1764              in conjunction with a negative gain-dB to the  effect  that  the
1765              audio is normalised to a given level below 0dB.  For example,
1766                 sox infile outfile gain -n
1767              normalises to 0dB, and
1768                 sox infile outfile gain -n -3
1769              normalises to -3dB.
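
                  To relate these dB figures to linear sample values, the
                  sketch below converts dB to an amplitude ratio.  It assumes
                  the conventional 20*log10 amplitude definition of dB; the
                  helper name db_to_amplitude is hypothetical.

```shell
#!/bin/sh
# db_to_amplitude DB
# Convert a level in dB to a linear amplitude ratio, assuming
# ratio = 10^(DB/20).  Hypothetical helper, not part of SoX.
db_to_amplitude() {
    awk -v db="$1" 'BEGIN { printf "%.4f\n", exp(log(10) * db / 20) }'
}

db_to_amplitude -3    # prints 0.7079
```

                  Under that assumption, `gain -n -3' leaves the highest peak
                  at roughly 71% of full scale.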
1770
1771              The -l option invokes a simple limiter, e.g.
1772                 sox infile outfile gain -l 6
1773              will  apply 6dB of gain but never clip.  Note that limiting more
1774              than a few dBs more than occasionally (in a piece of  audio)  is
1775              not  recommended  as  it  can cause audible distortion.  See the
1776              compand effect for a more capable limiter.
1777
1778              The -h option is used to apply gain  to  provide  head-room  for
1779              subsequent processing.  For example, with
1780                 sox infile outfile gain -h bass +6
1781              6dB  of  attenuation  will be applied prior to the bass boosting
1782              effect thus ensuring that it will not  clip.   Of  course,  with
1783              bass,  it  is obvious how much headroom will be needed, but with
1784              other effects (e.g.  rate, dither) it is not  always  as  clear.
1785              Another  advantage  of  using  gain  -h  rather than an explicit
1786              attenuation is that, if the headroom is not used  by  subsequent
1787              effects, it can be reclaimed with gain -r, for example:
1788                 sox infile outfile gain -h bass +6 rate 44100 gain -r
1789              The above effects chain guarantees never to clip nor amplify; it
1790              attenuates if necessary to prevent clipping, but by only as much
1791              as is needed to do so.
1792
1793              Output  formatting  (dithering  and  bit-depth  reduction)  also
1794              requires headroom (which cannot be `reclaimed'), e.g.
1795                 sox infile outfile gain -h bass +6 rate 44100 gain -rh dither
1796              Here, the second gain invocation reclaims as much of the
1797              headroom as it can from the preceding effects, but retains as much
1798              headroom as is needed for subsequent processing.  The SoX global
1799              option  -G can be given to automatically invoke gain -h and gain
1800              -r.
1801
1802              See also the norm and vol effects.
1803
1804       highpass|lowpass [-1|-2] frequency[k] [width[q|o|h|k]]
1805              Apply a high-pass or low-pass filter with 3dB  point  frequency.
1806              The  filter  can be either single-pole (with -1), or double-pole
1807              (the default, or with -2).  width applies  only  to  double-pole
1808              filters;  the  default  is  Q  =  0.707  and gives a Butterworth
1809              response.  The filters roll off at 6dB per pole per octave (20dB
1810              per  pole per decade).  The double-pole filters are described in
1811              detail in [1].
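
                  The roll-off figure above gives a quick estimate of stop-band
                  attenuation.  The sketch below applies it for a low-pass
                  filter; it is an asymptotic approximation that ignores the
                  response near the 3dB point, and the helper name rolloff_db
                  is hypothetical.

```shell
#!/bin/sh
# rolloff_db POLES CORNER_HZ FREQ_HZ
# Estimate low-pass attenuation (in dB) at FREQ_HZ, using 6 dB per pole per
# octave above CORNER_HZ.  Asymptotic estimate only.  Hypothetical helper.
rolloff_db() {
    awk -v poles="$1" -v fc="$2" -v f="$3" 'BEGIN {
        octaves = log(f / fc) / log(2)
        printf "%.1f\n", 6 * poles * octaves
    }'
}

# Double-pole low-pass at 1 kHz, evaluated three octaves up at 8 kHz.
rolloff_db 2 1000 8000    # prints 36.0
```

                  For a high-pass filter, swap the two frequency arguments so
                  that the ratio counts octaves below the corner.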
1812
1813              These effects support the --plot global option.
1814
1815              See also sinc for filters with a steeper roll-off.
1816
1817       hilbert [-n taps]
1818              Apply an odd-tap Hilbert transform  filter,  phase-shifting  the
1819              signal by 90 degrees.
1820
1821              This is used in many matrix coding schemes and for analytic sig‐
1822              nal generation.  The process is often written as  a  multiplica‐
1823              tion by i (or j), the imaginary unit.
1824
1825              An  odd-tap Hilbert transform filter has a bandpass characteris‐
1826              tic, attenuating the lowest and highest frequencies.  Its  band‐
1827              width  can be controlled by the number of filter taps, which can
1828              be specified with -n.  By default, the number of taps is  chosen
1829              for a cutoff frequency of about 75 Hz.
1830
1831              This effect supports the --plot global option.
1832
1833       ladspa [-l|-r] module [plugin] [argument ...]
1834              Apply  a  LADSPA [5] (Linux Audio Developer's Simple Plugin API)
1835              plugin.  Despite the name, LADSPA is not Linux-specific,  and  a
1836              wide  range  of  effects is available as LADSPA plugins, such as
1837              cmt [6] (the Computer Music Toolkit) and Steve  Harris's  plugin
1838              collection  [7].  The  first  argument is the plugin module, the
1839              second the name of the plugin (a module can  contain  more  than
1840              one  plugin),  and any other arguments are for the control ports
1841              of the plugin. Missing arguments are supplied by default  values
1842              if possible.
1843
1844              Normally, the number of input ports of the plugin must match the
1845              number of input channels, and the number of output ports  deter‐
1846              mines  the  output  channel  count.  However, the -r (replicate)
1847              option allows cloning a  mono  plugin  to  handle  multi-channel
1848              input.
1849
1850              Some  plugins introduce latency which SoX may optionally compen‐
1851              sate for.  The -l (latency  compensation)  option  automatically
1852              compensates  for latency as reported by the plugin via an output
1853              control port named "latency".
1854
1855              If found, the environment variable LADSPA_PATH will be  used  as
1856              search path for plugins.
1857
1858       loudness [gain [reference]]
1859              Loudness  control  -  similar  to  the gain effect, but provides
1860              equalisation   for   the    human    auditory    system.     See
1861              http://en.wikipedia.org/wiki/Loudness for a detailed description
1862              of loudness.  The gain is adjusted by the given  gain  parameter
1863              (usually negative) and the signal equalised according to ISO 226
1864              w.r.t. a reference level of 65dB, though an  alternative  refer‐
1865              ence level may be given if the original audio has been equalised
1866              for some other optimal level.  A default gain of -10dB  is  used
1867              if a gain value is not given.
1868
1869              See also the gain effect.
1870
1871       lowpass [-1|-2] frequency[k] [width[q|o|h|k]]
1872              Apply  a  low-pass  filter.  See the description of the highpass
1873              effect for details.
1874
1875       mcompand "attack1,decay1{,attack2,decay2}
1876              [soft-knee-dB:]in-dB1[,out-dB1]{,in-dB2,out-dB2}
1877              [gain    [initial-volume-dB    [delay]]]"     {crossover-freq[k]
1878              "attack1,..."}
1879
1880              The multi-band compander is similar to the single-band compander
1881              but the audio is first divided into bands  using  Linkwitz-Riley
1882              cross-over filters and a separately specifiable compander run on
1883              each band.  See the compand effect for  the  definition  of  its
1884              parameters.   Compand  parameters  are  specified between double
1885              quotes and the crossover frequency for that  band  is  given  by
1886              crossover-freq; these can be repeated to create multiple bands.
1887
1888              For  example,  the following (one long) command shows how multi-
1889              band companding is typically used in FM radio:
1890                 play track1.wav gain -3 sinc 8000- 29 100 mcompand \
1891                   "0.005,0.1 -47,-40,-34,-34,-17,-33" 100 \
1892                   "0.003,0.05 -47,-40,-34,-34,-17,-33" 400 \
1893                   "0.000625,0.0125 -47,-40,-34,-34,-15,-33" 1600 \
1894                   "0.0001,0.025 -47,-40,-34,-34,-31,-31,-0,-30" 6400 \
1895                   "0,0.025 -38,-31,-28,-28,-0,-25" \
1896                   gain 15 highpass 22 highpass 22 sinc -n 255 -b 16 -17500 \
1897                   gain 9 lowpass -1 17801
1898              The audio file is played with a simulated  FM  radio  sound  (or
1899              broadcast  signal  condition if the lowpass filter at the end is
1900              skipped).  Note that the pipeline is set up with  US-style  75us
1901              pre-emphasis.
1902
1903              See also compand for a single-band companding effect.
1904
1905       noiseprof [profile-file]
1906              Calculate  a  profile  of  the audio for use in noise reduction.
1907              See the description of the noisered effect for details.
1908
1909       noisered [profile-file [amount]]
1910              Reduce noise in the audio signal  by  profiling  and  filtering.
1911              This effect is moderately effective at removing consistent back‐
1912              ground noise such as hiss or hum.  To use it, first run SoX with
1913              the  noiseprof  effect  on a section of audio that ideally would
1914              contain silence but in fact contains noise - such  sections  are
1915              typically  found  at  the  beginning  or the end of a recording.
1916              noiseprof will write out a noise profile to profile-file, or  to
1917              stdout if no profile-file or if `-' is given.  E.g.
1918                 sox speech.wav -n trim 0 1.5 noiseprof speech.noise-profile
1919              To  actually remove the noise, run SoX again, this time with the
1920              noisered effect; noisered will reduce noise according to a noise
1921              profile  (which  was generated by noiseprof), from profile-file,
1922              or from stdin if no profile-file or if `-' is given.  E.g.
1923                 sox speech.wav cleaned.wav noisered speech.noise-profile 0.3
1924              How much noise should be removed is specified by amount, a number
1925              between  0  and  1  with  a default of 0.5.  Higher numbers will
1926              remove more noise but present a greater likelihood  of  removing
1927              wanted  components  of  the  audio  signal.  Before replacing an
1928              original recording with a noise-reduced version, experiment with
1929              different  amount values to find the optimal one for your audio;
1930              use headphones to check that you are  happy  with  the  results,
1931              paying particular attention to quieter sections of the audio.
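
                  One way to try several amount values is to generate one
                  command per value.  The sketch below only prints the
                  commands (a dry run); the helper name noisered_sweep, the
                  chosen amounts, and the output naming scheme are all
                  assumptions made for illustration.

```shell
#!/bin/sh
# noisered_sweep INPUT PROFILE
# Print one sox command per trial amount, writing each result to
# INPUT's base name with "-nr<amount>" appended.  Nothing is executed here.
# File names are not shell-quoted, so avoid spaces.  Hypothetical helper.
noisered_sweep() {
    src=$1
    profile=$2
    for amount in 0.1 0.2 0.3 0.4 0.5; do
        printf 'sox %s %s noisered %s %s\n' \
            "$src" "${src%.*}-nr$amount.${src##*.}" "$profile" "$amount"
    done
}
```

                  After reviewing the printed commands, they can be run with
                  `noisered_sweep speech.wav speech.noise-profile | sh'.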
1932
1933              On  most systems, the two stages - profiling and reduction - can
1934              be combined using a pipe, e.g.
1935                 sox noisy.wav -n trim 0 1 noiseprof | play noisy.wav noisered
1936
1937       norm [dB-level]
1938              Normalise the audio.  norm is just an alias for gain -n; see the
1939              gain effect for details.
1940
1941       oops   Out  Of  Phase  Stereo  effect.  Mixes stereo to twin-mono where
1942              each mono channel contains the difference between the  left  and
1943              right stereo channels.  This is sometimes known as the `karaoke'
1944              effect as it often has the effect of removing most or all of the
1945              vocals from a recording.  It is equivalent to remix 1,2i 1,2i.
1946
1947       overdrive [gain(20) [colour(20)]]
1948              Non-linear distortion.  The colour parameter controls the amount
1949              of even harmonic content in the over-driven output.
1950
1951       pad { length[@position(=)] }
1952              Pad the audio with silence, at the beginning, the  end,  or  any
1953              specified  points  through  the  audio.  length is the amount of
1954              silence to insert and position the position in the  input  audio
1955              stream  at  which to insert it.  Any number of lengths and posi‐
1956              tions may be specified, provided that a  specified  position  is
1957              not  less  than the previous one, and any time specification may
1958              be used for them.  position is optional for the first  and  last
1959              lengths specified and, if omitted, corresponds to the beginning and
1960              the end of the audio respectively.  For  example,  pad  1.5  1.5
1961              adds  1.5  seconds  of silence padding at each end of the audio,
1962              whilst pad 4000s@3:00 inserts 4000 samples of silence 3  minutes
1963              into  the  audio.   If  silence is wanted only at the end of the
1964              audio, specify either the end position or specify a  zero-length
1965              pad at the start.
1966
1967              See  also delay for an effect that can add silence at the begin‐
1968              ning of the audio on a channel-by-channel basis.
1969
1970       phaser gain-in gain-out delay decay speed [-s|-t]
1971              Add a phasing effect to the  audio.   See  [3]  for  a  detailed
1972              description of phasing.
1973
1974              delay/decay/speed  gives the delay in milliseconds and the decay
1975              (relative to gain-in) with a modulation speed in Hz.  The
1976              modulation is either sinusoidal (-s), which is preferable for
1977              multiple instruments, or triangular (-t), which gives single
1978              instruments a sharper phasing effect.  The decay should be less
1979              than 0.5 to avoid feedback, and usually no less than 0.1.  Gain-out
1980              is the volume of the output.
1981
1982              For example:
1983                 play snare.flac phaser 0.8 0.74 3 0.4 0.5 -t
1984              Gentler:
1985                 play snare.flac phaser 0.9 0.85 4 0.23 1.3 -s
1986              A popular sound:
1987                 play snare.flac phaser 0.89 0.85 1 0.24 2 -t
1988              More severe:
1989                 play snare.flac phaser 0.6 0.66 3 0.6 2 -t
1990
1991       pitch [-q] shift [segment [search [overlap]]]
1992              Change the audio pitch (but not tempo).
1993
1994              shift  gives  the  pitch  shift  as positive or negative `cents'
1995              (i.e. 100ths of  a  semitone).   See  the  tempo  effect  for  a
1996              description of the other parameters.
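
                  Since a cent is 1/100 of a semitone and an octave is 12
                  semitones, a shift in cents corresponds to a frequency ratio
                  of 2^(cents/1200).  The sketch below computes that ratio;
                  the helper name cents_to_ratio is hypothetical.

```shell
#!/bin/sh
# cents_to_ratio CENTS
# Convert a pitch shift in cents to a frequency ratio, 2^(CENTS/1200).
# Hypothetical helper, not part of SoX.
cents_to_ratio() {
    awk -v c="$1" 'BEGIN { printf "%.4f\n", exp(log(2) * c / 1200) }'
}

cents_to_ratio 100     # one semitone up: prints 1.0595
cents_to_ratio 1200    # one octave up:   prints 2.0000
```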
1997
1998              See also the bend, speed, and tempo effects.
1999
2000       rate [-q|-l|-m|-h|-v] [override-options] RATE[k]
2001              Change  the audio sampling rate (i.e. resample the audio) to any
2002              given RATE (even non-integer if this is supported by the  output
2003              file format) using a quality level defined as follows:
2004
2005                           Quality   Band-   Rej dB   Typical Use
2006                                     width
2007                     -q     quick     n/a    ≈30 @    playback on
2008                                              Fs/4    ancient hardware
2009                     -l      low      80%     100     playback on old
2010                                                      hardware
2011                     -m    medium     95%     100     audio playback
2012
2013
2014                     -h     high      95%     125     16-bit mastering
2015                                                      (use with dither)
2016                     -v   very high   95%     175     24-bit mastering
2017
2018              where Band-width is the percentage of the audio  frequency  band
2019              that  is  preserved  and Rej dB is the level of noise rejection.
2020              Increasing levels of resampling quality come at the  expense  of
2021              increasing  amounts of time to process the audio.  If no quality
2022              option is given, the quality  level  used  is  `high'  (but  see
2023              `Playing & Recording Audio' above regarding playback).
2024
2025              The  `quick'  algorithm uses cubic interpolation; all others use
2026              band-limited interpolation.  By default, all algorithms  have  a
2027              `linear'  phase  response; for `medium', `high' and `very high',
2028              the phase response is configurable (see below).
2029
2030              The rate effect is invoked  automatically  if  SoX's  -r  option
2031              specifies a rate that is different to that of the input file(s).
2032              Alternatively, if this effect is given explicitly, then SoX's -r
2033              option  need  not be given.  For example, the following two com‐
2034              mands are equivalent:
2035                 sox input.wav -r 48k output.wav bass -b 24
2036                 sox input.wav        output.wav bass -b 24 rate 48k
2037              though the second command is more flexible  as  it  allows  rate
2038              options  to be given, and allows the effects to be ordered arbi‐
2039              trarily.
2040
2041                                    *        *        *
2042
2043              Warning: technically detailed discussion follows.
2044
2045              The simple quality selection described above  provides  settings
2046              that satisfy the needs of the vast majority of resampling tasks.
2047              Occasionally, however, it may  be  desirable  to  fine-tune  the
2048              resampler's  filter  response;  this can be achieved using over‐
2049              ride options, as detailed in the following table:
2050
2051              -M/-I/-L     Phase response = minimum/intermediate/linear
2052              -s           Steep filter (band-width = 99%)
2053              -a           Allow aliasing/imaging above the pass-band
2054              -b 74-99.7   Any band-width %
2055              -p 0-100     Any phase response (0 = minimum, 25 = intermediate,
2056                           50 = linear, 100 = maximum)
2057
2058              N.B.   Override options cannot be used with the `quick' or `low'
2059              quality algorithms.
2060
2061              All resamplers use filters  that  can  sometimes  create  `echo'
2062              (a.k.a.   `ringing')  artefacts  with  transient signals such as
2063              those that occur with `finger snaps' or other highly  percussive
2064              sounds.   Such  artefacts  are much more noticeable to the human
2065              ear if they occur before the transient (`pre-echo') than if they
2066              occur after it (`post-echo').  Note that the frequency of any such
2067              artefacts is related to the smaller of the original and new sam‐
2068              pling rates but that if this is at least 44.1kHz, then the arte‐
2069              facts will lie outside the range of human hearing.
2070
2071              A phase response setting may be used to control the distribution
2072              of  any  transient  echo  between `pre' and `post': with minimum
2073              phase, there is no pre-echo but the longest post-echo; with lin‐
2074              ear  phase,  pre  and  post echo are in equal amounts (in signal
2075              terms, but not audibility terms); the intermediate phase setting
2076              attempts to find the best compromise by selecting a small length
2077              (and level) of pre-echo and a medium-length post-echo.
2078
2079              Minimum, intermediate, or  linear  phase  response  is  selected
2080              using  the  -M, -I, or -L option; a custom phase response can be
2081              created with the -p option.  Note that phase  responses  between
2082              `linear' and `maximum' (greater than 50) are rarely useful.
2083
2084              A resampler's band-width setting determines how much of the fre‐
2085              quency content of the original signal (w.r.t. the original  sam‐
2086              ple rate when up-sampling, or the new sample rate when down-sam‐
2087              pling) is preserved during conversion.  The term `pass-band'  is
2088              used  to  refer  to  all  frequencies up to the band-width point
2089              (e.g. for 44.1kHz sampling rate, and a resampling band-width  of
2090              95%,  the  pass-band  represents  frequencies from 0Hz (D.C.) to
2091              circa 21kHz).  Increasing the resampler's band-width results  in
2092              a  slower  conversion  and can increase transient echo artefacts
2093              (and vice versa).
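
                  The pass-band edge in the example above can be worked out
                  directly: it is the band-width percentage applied to half
                  of the smaller of the two sampling rates.  The sketch below
                  does that arithmetic; the helper name passband_hz is
                  hypothetical.

```shell
#!/bin/sh
# passband_hz IN_RATE OUT_RATE BANDWIDTH_PERCENT
# Print the upper edge of the pass-band in Hz: BANDWIDTH_PERCENT of the
# Nyquist frequency of the lower of the two rates.  Hypothetical helper.
passband_hz() {
    awk -v a="$1" -v b="$2" -v bw="$3" 'BEGIN {
        rate = (a < b) ? a : b
        printf "%.1f\n", rate / 2 * bw / 100
    }'
}

# Up-sampling 44.1 kHz to 96 kHz at the default 95% band-width.
passband_hz 44100 96000 95    # prints 20947.5
```

                  The result, about 20.9kHz, matches the `circa 21kHz' quoted
                  above.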
2094
2095              The -s `steep filter' option changes resampling band-width  from
2096              the default 95% (based on the 3dB point), to 99%.  The -b option
2097              allows the band-width to be  set  to  any  value  in  the  range
2098              74-99.7  %, but note that band-width values greater than 99% are
2099              not recommended for normal use as they can cause excessive tran‐
2100              sient echo.
2101
2102              If the -a option is given, then aliasing/imaging above the pass-
2103              band is allowed.  For example, with 44.1kHz sampling rate, and a
2104              resampling  band-width of 95%, this means that frequency content
2105              above 21kHz can be distorted; however, since this is  above  the
2106              pass-band  (i.e.   above the highest frequency of interest/audi‐
2107              bility), this may not be a problem.  The  benefits  of  allowing
2108              aliasing/imaging  are  reduced  processing time, and reduced (by
2109              almost half) transient echo artefacts.  Note that if this option
2110              is   given,  then  the  minimum  band-width  allowable  with  -b
2111              increases to 85%.
2112
2113              Examples:
2114                 sox input.wav -b 16 output.wav rate -s -a 44100 dither -s
2115              default (high)  quality  resampling;  overrides:  steep  filter,
2116              allow  aliasing;  to 44.1kHz sample rate; noise-shaped dither to
2117              16-bit WAV file.
2118                 sox input.wav -b 24 output.aiff rate -v -I -b 90 48k
2119              very high quality  resampling;  overrides:  intermediate  phase,
2120              band-width  90%; to 48k sample rate; store output to 24-bit AIFF
2121              file.
2122
2123                                    *        *        *
2124
2125              The pitch and speed effects use the rate effect at their core.
2126
2127       remix [-a|-m|-p] <out-spec>
2128              out-spec  = in-spec{,in-spec} | 0
2129              in-spec   = [in-chan][-[in-chan2]][vol-spec]
2130              vol-spec  = p|i|v[volume]
2131
2132              Select and mix input audio channels into output audio  channels.
2133              Each  output channel is specified, in turn, by a given out-spec:
2134              a list of contributing input channels and volume specifications.
2135
2136              Note that this effect operates on the audio channels within  the
2137              SoX effects processing chain; it should not be confused with the
2138              -m global option (where multiple files are  mix-combined  before
2139              entering the effects chain).
2140
2141              An  out-spec  contains comma-separated input channel-numbers and
2142              hyphen-delimited channel-number ranges; alternatively, 0 may  be
2143              given to create a silent output channel.  For example,
2144                 sox input.wav output.wav remix 6 7 8 0
2145              creates  an output file with four channels, where channels 1, 2,
2146              and 3 are copies of channels 6, 7, and 8 in the input file,  and
2147              channel 4 is silent.  Whereas
2148                 sox input.wav output.wav remix 1-3,7 3
2149              creates  a  (somewhat bizarre) stereo output file where the left
2150              channel is a mix-down of input channels 1, 2, 3, and 7, and  the
2151              right channel is a copy of input channel 3.
2152
2153              Where  a  range of channels is specified, the channel numbers to
2154              the left and right of the hyphen are optional and default  to  1
2155              and to the number of input channels respectively. Thus
2156                 sox input.wav output.wav remix -
2157              performs a mix-down of all input channels to mono.
2158
2159              By  default,  where an output channel is mixed from multiple (n)
2160              input channels, each input channel will be scaled by a factor of
2161              ¹/n.   Custom  mixing  volumes  can  be set by following a given
2162              input channel or range of input channels with a vol-spec (volume
2163              specification).  This is one of the letters p, i, or v, followed
2164              by a volume number, the meaning of which depends  on  the  given
2165              letter and is defined as follows:
2166
2167                      Letter   Volume number        Notes
2168                        p      power adjust in dB   0 = no change
2169
2170                        i      power adjust in dB   As `p', but invert
2171                                                    the audio
2172                        v      voltage multiplier   1 = no change, 0.5
2173                                                    ≈ 6dB attenuation,
2174                                                    2 ≈ 6dB gain, -1 =
2175                                                    invert
2176
2177              If  an out-spec includes at least one vol-spec then, by default,
2178              ¹/n scaling is not applied to any other  channels  in  the  same
2179              out-spec (though it may be in other out-specs).  The -a (automatic)
2180              option, however, can be given to retain the automatic scaling in
2181              this case.  For example,
2182                 sox input.wav output.wav remix 1,2 3,4v0.8
2183              results in channel level multipliers of 0.5,0.5 1,0.8, whereas
2184                 sox input.wav output.wav remix -a 1,2 3,4v0.8
2185              results in channel level multipliers of 0.5,0.5 0.5,0.8.
2186
2187              The  -m  (manual)  option  disables all automatic volume adjust‐
2188              ments, so
2189                 sox input.wav output.wav remix -m 1,2 3,4v0.8
2190              results in channel level multipliers of 1,1 1,0.8.
2191
2192              The volume number is optional and omitting it corresponds to  no
2193              volume change; however, the only case in which this is useful is
2194              in conjunction with i.  For example,  if  input.wav  is  stereo,
2195              then
2196                 sox input.wav output.wav remix 1,2i
2197              is a mono equivalent of the oops effect.
2198
2199              If  the  -p  option  is given, then any automatic ¹/n scaling is
2200              replaced by ¹/√n (`power') scaling; this gives a louder mix  but
2201              one that might occasionally clip.
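
                  To see how much louder `power' scaling is, the sketch below
                  prints the per-channel multiplier for both schemes.  The
                  helper name mix_scale and its mode argument are hypothetical.

```shell
#!/bin/sh
# mix_scale N [p]
# Print the multiplier applied to each of N mixed input channels:
# 1/N by default, or 1/sqrt(N) when the second argument is "p"
# (corresponding to remix -p).  Hypothetical helper, not part of SoX.
mix_scale() {
    awk -v n="$1" -v mode="${2:-a}" 'BEGIN {
        if (mode == "p") printf "%.4f\n", 1 / sqrt(n)
        else             printf "%.4f\n", 1 / n
    }'
}

mix_scale 4      # prints 0.2500
mix_scale 4 p    # prints 0.5000
```

                  With four channels, each input is therefore mixed at twice
                  the amplitude under -p, which is why clipping becomes
                  possible when the channels are strongly correlated.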
2202
2203                                    *        *        *
2204
2205              One use of the remix effect is to split an audio file into a set
2206              of files, each containing one of the  constituent  channels  (in
2207              order to perform subsequent processing on individual audio chan‐
2208              nels).  Where more than a few channels are  involved,  a  script
2209              such as the following (Bourne shell script) is useful:
2210              #!/bin/sh
2211              chans=`soxi -c "$1"`
2212              while [ $chans -ge 1 ]; do
2213                 chans0=`printf %02i $chans`   # 2 digits hence up to 99 chans
2214                 out=`echo "$1"|sed "s/\(.*\)\.\(.*\)/\1-$chans0.\2/"`
2215                 sox "$1" "$out" remix $chans
2216                 chans=`expr $chans - 1`
2217              done
2218              If  a  file  input.wav containing six audio channels were given,
2219              the  script  would  produce  six  output  files:   input-01.wav,
2220              input-02.wav, ..., input-06.wav.
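
              To preview the file names before running such a script, a dry
              run along the following lines may help.  This is only a sketch:
              split_names is a hypothetical helper that derives the names with
              shell parameter expansion instead of sed, assumes the file name
              contains an extension, and never invokes SoX.

```sh
#!/bin/sh
# split_names FILE NCHANS: list the per-channel output names
# (FILE must contain a dot; channel numbers are zero-padded to 2 digits).
split_names() {
    file=$1
    i=1
    while [ "$i" -le "$2" ]; do
        # ${file%.*} is the name without its extension, ${file##*.} the extension
        printf '%s-%02d.%s\n' "${file%.*}" "$i" "${file##*.}"
        i=$((i + 1))
    done
}

split_names input.wav 3
```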
2221
2222              See also the swap effect.
2223
2224       repeat [count(1)|-]
2225              Repeat  the  entire  audio  count times, or once if count is not
2226              given.   The  special  value  -  requests  infinite  repetition.
2227              Requires temporary file space to store the audio to be repeated.
2228              Note that repeating once yields two copies: the  original  audio
2229              and the repeated audio.
2230
2231       reverb [-w|--wet-only] [reverberance (50%) [HF-damping (50%)
2232              [room-scale (100%) [stereo-depth (100%)
2233              [pre-delay (0ms) [wet-gain (0dB)]]]]]]
2234
2235              Add  reverberation  to the audio using the `freeverb' algorithm.
2236              A reverberation effect is sometimes desirable for concert  halls
2237              that  are  too  small  or contain so many people that the hall's
2238              natural reverberance is diminished.  Applying a small amount  of
2239              stereo  reverb to a (dry) mono signal will usually make it sound
2240              more natural.  See [3] for a detailed description of  reverbera‐
2241              tion.
2242
2243              Note  that  this effect increases both the volume and the length
2244              of the audio, so to prevent clipping in these domains, a typical
2245              invocation might be:
2246                 play dry.wav gain -3 pad 0 3 reverb
2247              The -w option can be given to select only the `wet' signal, thus
2248              allowing it to be processed further, independently of the  `dry'
2249              signal.  E.g.
2250                 play -m voice.wav "|sox voice.wav -p reverse reverb -w reverse"
2251              for a reverse reverb effect.
2252
2253       reverse
2254              Reverse  the audio completely.  Requires temporary file space to
2255              store the audio to be reversed.
2256
2257       riaa   Apply RIAA vinyl playback equalisation.  The sampling rate  must
2258              be one of: 44.1, 48, 88.2, 96 kHz.
2259
2260              This effect supports the --plot global option.
2261
2262       silence [-l] above-periods [duration threshold[d|%]
2263              [below-periods duration threshold[d|%]]]
2264
2265              Removes silence from the beginning, middle, or end of the audio.
2266              `Silence' is determined by a specified threshold.
2267
2268              The above-periods value is used to indicate if audio  should  be
2269              trimmed at the beginning of the audio. A value of zero indicates
2270              no silence should be trimmed from the beginning. When specifying
2271              a  non-zero above-periods, it trims audio up until it finds non-
2272              silence. Normally, when trimming silence from the beginning of
2273              the audio, above-periods will be 1, but it can be increased to higher
2274              values to trim all audio up to a specific count  of  non-silence
2275              periods.  For  example,  if you had an audio file with two songs
2276              that each contained 2 seconds of silence before  the  song,  you
2277              could  specify  an  above-period  of 2 to strip out both silence
2278              periods and the first song.
2279
2280              When above-periods is non-zero, you must also specify a duration
2281              and  threshold.  duration indicates the amount of time that non-
2282              silence must be detected before  it  stops  trimming  audio.  By
2283              increasing  the  duration,  bursts of noise  can  be treated as
2284              silence and trimmed off.
2285
2286              threshold is used to indicate what sample value you should treat
2287              as silence.  For digital audio, a value of 0 may be fine but for
2288              audio recorded from analog, you may wish to increase  the  value
2289              to account for background noise.
2290
2291              When  optionally trimming silence from the end of the audio, you
2292              specify a below-periods count.  In this case, below-period means
2293              to  remove  all audio after silence is detected.  Normally, this
2294              will be a value of 1 but it can be increased to skip over
2295              periods of silence that are wanted.  For example, if you have a song
2296              with 2 seconds of silence in the middle and 2 seconds at the end,
2297              you  could  set  below-period  to  a value of 2 to skip over the
2298              silence in the middle of the audio.
2299
2300              For below-periods, duration specifies a period of  silence  that
2301              must exist before audio is not copied any more.  By specifying a
2302              higher duration, silence that is  wanted  can  be  left  in  the
2303              audio.   For example, if you have a song with an expected 1 sec‐
2304              ond of silence in the middle and 2 seconds  of  silence  at  the
2305              end, a duration of 2 seconds could be used to skip over the mid‐
2306              dle silence.
2307
2308              Unfortunately, you must know the length of the  silence  at  the
2309              end  of  your  audio file to trim off silence reliably.  A work‐
2310              around is to use the silence  effect  in  combination  with  the
2311              reverse  effect.   By first reversing the audio, you can use the
2312              above-periods to reliably trim all audio from  what  looks  like
2313              the  front of the file.  Then reverse the file again to get back
2314              to normal.
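
              The reverse workaround can be written as a single effects chain.
              The sketch below is an assumption-laden example: the function
              name trim_ends, the DRY_RUN switch, the 0.1 second duration and
              the 1% threshold are all illustrative choices, not values
              prescribed by SoX.

```sh
#!/bin/sh
# trim_ends IN OUT: strip leading and trailing silence from IN into OUT.
# Leading silence is trimmed, the audio is reversed so that the tail
# becomes the head, trimmed again, and finally reversed back.
# Set DRY_RUN=1 (hypothetical switch) to print the command instead of running it.
trim_ends() {
    ${DRY_RUN:+echo} sox "$1" "$2" \
        silence 1 0.1t 1% reverse \
        silence 1 0.1t 1% reverse
}
```

              With SoX installed, trim_ends in.wav out.wav runs the chain;
              bear in mind that reverse needs temporary file space.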
2315
2316              To remove silence from the middle of a file,  specify  a  below-
2317              periods that is negative.  This value is then treated as a posi‐
2318              tive value and is also used to indicate that the  effect  should
2319              restart  processing as specified by the above-periods, making it
2320              suitable for removing periods of silence in the  middle  of  the
2321              audio.
2322
2323              The  option  -l  indicates that below-periods duration length of
2324              audio should be left intact at the beginning of each  period  of
2325              silence.  This is useful, for example, if you want to shorten long
2326              pauses between words without removing the pauses completely.
2327
2328              duration is a time specification with  the  peculiarity  that  a
2329              bare number is interpreted as a sample count, not as a number of
2330              seconds.  For specifying seconds, either use the t suffix (as in
2331              `2t') or specify minutes, too (as in `0:02').
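
              Because a bare number is a sample count, it can help to see how
              many samples a given time corresponds to.  The helper below is a
              hypothetical sketch that simply multiplies seconds by the sample
              rate and rounds to the nearest sample.

```sh
#!/bin/sh
# secs_to_samples SECONDS RATE: number of samples covering SECONDS at RATE Hz
secs_to_samples() {
    awk -v s="$1" -v r="$2" 'BEGIN { printf "%d\n", s * r + 0.5 }'
}

secs_to_samples 0.5 44100
secs_to_samples 2 48000
```

              So, at 44.1 kHz, a duration written as 22050 and one written as
              0.5t describe the same length of audio.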
2332
2333              threshold  numbers  may be suffixed with d to indicate the value
2334              is in decibels, or % to indicate a percentage of  maximum  value
2335              of the sample value (0% specifies pure digital silence).
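
              The two threshold units can be related with a little arithmetic.
              The sketch below assumes the usual amplitude convention, in which
              a level in decibels relative to full scale equals 20 times the
              base-10 logarithm of the amplitude ratio; db_to_pct is a
              hypothetical helper and does not call SoX.

```sh
#!/bin/sh
# db_to_pct DB: convert a level in dB (relative to full scale) to a
# percentage of the maximum sample value, assuming dB = 20*log10(ratio).
db_to_pct() {
    awk -v db="$1" 'BEGIN { printf "%.2f\n", 100 * exp(db / 20 * log(10)) }'
}

db_to_pct -40
db_to_pct -20
```

              Under that assumption, thresholds of -40d and 1% are roughly
              equivalent.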
2336
2337              The following example shows how this effect can be used to start
2338              a recording that does not contain the delay at the  start  which
2339              usually  occurs  between  `pressing  the  record button' and the
2340              start of the performance:
2341                 rec parameters filename other-effects silence 1 5 2%
2342
2343       sinc [-a att|-b beta] [-p phase|-M|-I|-L] [-t tbw|-n taps] [freqHP]
2344       [-freqLP [-t tbw|-n taps]]
2345              Apply  a sinc kaiser-windowed low-pass, high-pass, band-pass, or
2346              band-reject filter to the signal.  The freqHP and freqLP parame‐
2347              ters  give  the frequencies of the 6dB points of a high-pass and
2348              low-pass filter that may be invoked individually,  or  together.
2349              If  both are given, then freqHP less than freqLP creates a band-
2350              pass filter, freqHP greater than freqLP  creates  a  band-reject
2351              filter.  For example, the invocations
2352                 sinc 3k
2353                 sinc -4k
2354                 sinc 3k-4k
2355                 sinc 4k-3k
2356              create  a high-pass, low-pass, band-pass, and band-reject filter
2357              respectively.
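
              The rule relating the two frequencies to the filter type can be
              sketched as a small shell function.  sinc_kind is a hypothetical
              helper: it takes whole numbers of Hertz rather than the `3k'
              notation, expects at least one frequency to be given, and only
              names the filter type; it does not run SoX.

```sh
#!/bin/sh
# sinc_kind HP LP: name the filter that `sinc HP-LP' would create.
# Pass "" for a frequency that is not given (at least one is required).
# Equal frequencies are reported as band-reject here; the text above
# does not define that case.
sinc_kind() {
    hp=$1
    lp=$2
    if [ -z "$lp" ]; then
        echo high-pass
    elif [ -z "$hp" ]; then
        echo low-pass
    elif [ "$hp" -lt "$lp" ]; then
        echo band-pass
    else
        echo band-reject
    fi
}

sinc_kind 3000 ""     # corresponds to: sinc 3k
sinc_kind "" 4000     # corresponds to: sinc -4k
sinc_kind 3000 4000   # corresponds to: sinc 3k-4k
sinc_kind 4000 3000   # corresponds to: sinc 4k-3k
```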
2358
2359              The default stop-band attenuation of  120dB  can  be  overridden
2360              with  -a;  alternatively, the kaiser-window `beta' parameter can
2361              be given directly with -b.
2362
2363              The default transition band-width of 5% of the total band can be
2364              overridden with -t (and tbw in Hertz); alternatively, the number
2365              of filter taps can be given directly with -n.
2366
2367              If both freqHP and freqLP are given, then  a  -t  or  -n  option
2368              given  to  the  left of the frequencies applies to both frequen‐
2369              cies; one of these options given to the right of the frequencies
2370              applies only to freqLP.
2371
2372              The  -p,  -M,  -I,  and  -L  options  control the filter's phase
2373              response; see the rate effect for details.
2374
2375              This effect supports the --plot global option.
2376
2377       spectrogram [options]
2378              Create a spectrogram of the audio; the audio is  passed  unmodi‐
2379              fied  through the SoX processing chain.  This effect is optional
2380              - type sox --help and check the list of supported effects to see
2381              if it has been included.
2382
2383              The  spectrogram is rendered in a Portable Network Graphic (PNG)
2384              file, and shows time in the X-axis, frequency in the Y-axis, and
2385              audio  signal magnitude in the Z-axis.  Z-axis values are repre‐
2386              sented by the colour (or optionally the intensity) of the pixels
2387              in  the  X-Y plane.  If the audio signal contains multiple chan‐
2388              nels then these are shown from top to bottom starting from chan‐
2389              nel 1 (which is the left channel for stereo audio).
2390
2391              For example, if `my.wav' is a stereo file, then with
2392                 sox my.wav -n spectrogram
2393              a  spectrogram  of  the  entire file will be created in the file
2394              `spectrogram.png'.  More often though,  analysis  of  a  smaller
2395              portion of the audio is required; e.g. with
2396                 sox my.wav -n remix 2 trim 20 30 spectrogram
2397              the  spectrogram  shows information only from the second (right)
2398              channel, and of thirty seconds of  audio  starting  from  twenty
2399              seconds in.  To analyse a small portion of the frequency domain,
2400              the rate effect may be used, e.g.
2401                 sox my.wav -n rate 6k spectrogram
2402              allows detailed analysis of frequencies up  to  3kHz  (half  the
2403              sampling rate) i.e. where the human auditory system is most sen‐
2404              sitive.  With
2405                 sox my.wav -n trim 0 10 spectrogram -x 600 -y 200 -z 100
2406              the given options control the size of the spectrogram's X, Y & Z
2407              axes  (in  this case, the spectrogram area of the produced image
2408              will be 600 by 200 pixels in size and the Z-axis range  will  be
2409              100  dB).   Note  that  the produced image includes axes legends
2410              etc. and so will be a little larger than the specified  spectro‐
2411              gram size.  In this example:
2412                 sox -n -n synth 6 tri 10k:14k spectrogram -z 100 -w kaiser
2413              an analysis `window' with high dynamic range is selected to best
2414              display the spectrogram of a swept triangular wave.  For a
2415              similar example, append the following to the `chime' command in the
2416              description of the delay effect (above):
2417                 rate 2k spectrogram -X 200 -Z -10 -w kaiser
2418              Options are also available to control  the  appearance  (colour-
2419              set,  brightness,  contrast,  etc.) and filename of the spectro‐
2420              gram; e.g. with
2421                 sox my.wav -n spectrogram -m -l -o print.png
2422              a spectrogram is created suitable for printing on a  `black  and
2423              white' printer.
2424
2425              Options:
2426
2427              -x num Change  the  (maximum)  width (X-axis) of the spectrogram
2428                     from its default value of 800 pixels to  a  given  number
2429                     between 100 and 200000.  See also -X and -d.
2430
2431              -X num X-axis  pixels/second;  the default is auto-calculated to
2432                     fit the given or known audio duration to the X-axis size,
2433                     or  100 otherwise.  If given in conjunction with -d, this
2434                     option affects the width of the  spectrogram;  otherwise,
2435                     it  affects  the duration of the spectrogram.  num can be
2436                     from 1 (low time resolution) to 5000 (high  time  resolu‐
2437                     tion)  and need not be an integer.  SoX may make a slight
2438                     adjustment to the given number for  processing  quantisa‐
2439                     tion  reasons;  if  so, SoX will report the actual number
2440                     used (viewable when  the  SoX  global  option  -V  is  in
2441                     effect).  See also -x and -d.
2442
2443              -y num Sets the Y-axis size in pixels (per channel); this is the
2444                     number of frequency `bins' used in the  Fourier  analysis
2445                     that  produces  the  spectrogram.  N.B. it can be slow to
2446                     produce the spectrogram if this number is  not  one  more
2447                     than  a  power  of two (e.g. 129).  By default the Y-axis
2448                     size is chosen automatically (depending on the number  of
2449                     channels).   See  -Y for an alternative way of setting
2450                     spectrogram height.
2451
2452              -Y num Sets the target total height of the spectrogram(s).   The
2453                     default  value  is 550 pixels.  Using this option (and by
2454                     default), SoX will choose a height for  individual  spec‐
2455                     trogram channels that is one more than a power of two, so
2456                     the actual total height may fall short of the given  num‐
2457                     ber.  However, there is also a minimum height per channel
2458                     so  if  there  are  many  channels,  the  number  may  be
2459                     exceeded.  See -y for an alternative way of setting
2460                     spectrogram height.
2461
2462              -z num Z-axis (colour) range in dB, default 120.  This sets  the
2463                     dynamic-range  of  the  spectrogram  to  be  -num dBFS to
2464                     0 dBFS.  Num  may  range  from  20  to  180.   Decreasing
2465                     dynamic-range effectively increases the `contrast' of the
2466                     spectrogram display, and vice versa.
2467
2468              -Z num Sets the upper limit of the Z-axis in dBFS.   A  negative
2469                     num  effectively  increases the `brightness' of the spec‐
2470                     trogram display, and vice versa.
2471
2472              -q num Sets the Z-axis quantisation, i.e. the number of  differ‐
2473                     ent  colours  (or  intensities) in which to render Z-axis
2474                     values.   A  small  number   (e.g.   4)   will   give   a
2475                     `poster'-like  effect  making it easier to discern magni‐
2476                     tude bands of similar level.  Small numbers also  usually
2477                     result  in  small  PNG files.  The number given specifies
2478                     the number of colours to use inside the Z-axis range; two
2479                     colours are reserved to represent out-of-range values.
2480
2481              -w name
2482                     Window:  Hann  (default), Hamming, Bartlett, Rectangular,
2483                     Kaiser or Dolph.  The spectrogram is produced  using  the
2484                     Discrete  Fourier  Transform (DFT) algorithm.  A signifi‐
2485                     cant parameter to this algorithm is the choice of `window
2486                     function'.   By  default,  SoX uses the Hann window which
2487                     has good all-round frequency-resolution and dynamic-range
2488                     properties.   For  better frequency resolution (but lower
2489                     dynamic-range),  select  a  Hamming  window;  for  higher
2490                     dynamic-range (but poorer frequency-resolution), select a
2491                     Dolph window.  Kaiser, Bartlett and  Rectangular  windows
2492                     are also available.
2493
2494              -W num Window  adjustment  parameter.   This can be used to make
2495                     small adjustments to the Kaiser or Dolph window shape.  A
2496                     positive  number (up to ten) increases its dynamic range,
2497                     a negative number decreases it.
2498
2499              -s     Allow slack overlapping of DFT  windows.   This  can,  in
2500                     some  cases,  increase  image  sharpness and give greater
2501                     adherence to the -x value, but at the expense of a little
2502                     spectral loss.
2503
2504              -m     Creates a monochrome spectrogram (the default is colour).
2505
2506              -h     Selects  a  high-colour  palette - less visually pleasing
2507                     than the default colour palette, but it may make it  eas‐
2508                     ier to differentiate different levels.  If this option is
2509                     used in conjunction with -m, the result will be a  hybrid
2510                     monochrome/colour palette.
2511
2512              -p num Permute  the  colours in a colour or hybrid palette.  The
2513                     num parameter, from 1 (the default)  to  6,  selects  the
2514                     permutation.
2515
2516              -l     Creates  a  `printer  friendly'  spectrogram with a light
2517                     background (the default has a dark background).
2518
2519              -a     Suppress the display of the axis lines.   This  is  some‐
2520                     times useful in helping to discern artefacts at the spec‐
2521                     trogram edges.
2522
2523              -r     Raw spectrogram: suppress the display of  axes  and  leg‐
2524                     ends.
2525
2526              -A     Selects  an  alternative, fixed colour-set.  This is pro‐
2527                     vided only for compatibility with  spectrograms  produced
2528                     by another package.  It should not normally be used as it
2529                     has some problems, not least, a lack  of  differentiation
2530                     at  the  bottom end which results in masking of low-level
2531                     artefacts.
2532
2533              -t text
2534                     Set the image title - text to display above the  spectro‐
2535                     gram.
2536
2537              -c text
2538                     Set  (or clear) the image comment - text to display below
2539                     and to the left of the spectrogram.
2540
2541              -o file
2542                     Name of the spectrogram output PNG file,  default  `spec‐
2543                     trogram.png'.   If  `-' is given, the spectrogram will be
2544                     sent to standard output (stdout).
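
              The note under -y about sizes that are one more than a power of
              two can be turned into a small helper for choosing a value.  The
              sketch below is hypothetical: fast_y returns the largest such
              number that does not exceed a wanted height (assumed to be at
              least 3) and does not check it against the limits SoX enforces.

```sh
#!/bin/sh
# fast_y HEIGHT: largest value of the form 2^k + 1 that is <= HEIGHT
fast_y() {
    y=2
    while [ $((y * 2 + 1)) -le "$1" ]; do
        y=$((y * 2))
    done
    echo $((y + 1))
}

fast_y 200
fast_y 600
```

              The result could then be passed on, e.g. as
              spectrogram -y "$(fast_y 200)".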
2545
2546              Advanced Options:
2547              In order to process a smaller section of audio without affecting
2548              other  effects or the output signal (unlike when the trim effect
2549              is used), the following options may be used.
2550
2551              -d duration
2552                     This option sets the X-axis resolution  such  that  audio
2553                     with  the  given duration (a time specification) fits the
2554                     selected (or default) X-axis width.  For example,
2555                        sox input.mp3 output.wav spectrogram -d 1:00 stats
2556                     creates a spectrogram showing the  first  minute  of  the
2557                     audio, whilst the stats effect is applied to the entire
2558                     audio signal.
2559
2560                     See  also -X for an alternative way of setting the X-axis
2561                     resolution.
2562
2563              -S position(=)
2564                     Start the spectrogram at the given  point  in  the  audio
2565                     stream.  For example
2566                        sox input.aiff output.wav spectrogram -S 1:00
2567                     creates a spectrogram showing all but the first minute of
2568                     the audio (the output file, however, receives the  entire
2569                     audio stream).
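
              The relationship between -x, -X and -d is simple proportionality:
              the width in pixels is the number of pixels per second multiplied
              by the duration shown.  The following sketch illustrates this
              with two hypothetical helpers; it ignores the slight adjustment
              that SoX may make to -X for quantisation reasons.

```sh
#!/bin/sh
# spec_width PIXELS_PER_SEC SECONDS: width implied by -X together with -d
spec_width() {
    awk -v x="$1" -v d="$2" 'BEGIN { printf "%d\n", x * d + 0.5 }'
}

# spec_duration WIDTH PIXELS_PER_SEC: seconds shown for a given -x and -X
spec_duration() {
    awk -v w="$1" -v x="$2" 'BEGIN { printf "%.2f\n", w / x }'
}

spec_width 10 60        # -X 10 with -d 1:00
spec_duration 800 100   # default width with -X 100
```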
2570
2571              For the ability to perform off-line processing of spectral data,
2572              see the stat effect.
2573
2574       speed factor[c]
2575              Adjust the audio speed (pitch and tempo  together).   factor  is
2576              either the ratio of the new speed to the old speed: greater than
2577              1 speeds up, less than 1 slows down, or, if  appended  with  the
2578              letter  `c',  the number of cents (i.e. 100ths of a semitone) by
2579              which the pitch (and tempo) should be adjusted: greater  than  0
2580              increases, less than 0 decreases.
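
              Since there are 1200 cents in an octave and an octave doubles
              the frequency, a value in cents corresponds to a speed ratio of
              2 raised to the power cents/1200.  The helper below is a
              hypothetical sketch of that conversion and does not call SoX.

```sh
#!/bin/sh
# cents_to_ratio CENTS: speed ratio equivalent to a change of CENTS
cents_to_ratio() {
    awk -v c="$1" 'BEGIN { printf "%.4f\n", exp(c / 1200 * log(2)) }'
}

cents_to_ratio 1200    # one octave up
cents_to_ratio 100     # one semitone up
cents_to_ratio -1200   # one octave down
```

              Thus speed 100c and speed 1.0595 request approximately the same
              adjustment.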
2581
2582              Technically,  the  speed  effect  only  changes  the sample rate
2583              information, leaving the samples themselves untouched.  The rate
2584              effect is invoked automatically to resample to the output sample
2585              rate, using its default quality/speed.  For  higher  quality  or
2586              higher  speed resampling, in addition to the speed effect, spec‐
2587              ify the rate effect with the desired quality option.
2588
2589              See also the bend, pitch, and tempo effects.
2590
2591       splice  [-h|-t|-q] { position(=)[,excess[,leeway]] }
2592              Splice together audio sections.  This effect provides two things
2593              over simple audio concatenation: a (usually short) cross-fade is
2594              applied at the join, and a wave similarity comparison is made to
2595              help determine the best place at which to make the join.
2596
2597              One of the options -h, -t, or -q may be given to select the fade
2598              envelope as half-cosine wave (the default),  triangular  (a.k.a.
2599              linear), or quarter-cosine wave respectively.
2600
2601                     Type   Audio          Fade level       Transitions
2602                      t     correlated     constant gain    abrupt
2603                      h     correlated     constant gain    smooth
2604                      q     uncorrelated   constant power   smooth
2605
2606              To  perform  a  splice,  first use the trim effect to select the
2607              audio sections to be joined together.  As when performing a tape
2608              splice,  the  end  of  the  section to be spliced onto should be
2609              trimmed with a small excess (default  0.005  seconds)  of  audio
2610              after  the ideal joining point.  The beginning of the audio sec‐
2611              tion to splice on should be trimmed with the same excess (before
2612              the  ideal  joining  point),  plus an additional leeway (default
2613              0.005 seconds).  Any time specification may be  used  for  these
2614              parameters.   SoX should then be invoked with the two audio
2615              sections as input files and the splice effect given with the
2616              position at which to perform the splice - this is the length of the
2617              first audio section (including the excess).
2618
2619              The following diagram uses the tape analogy  to  illustrate  the
2620              splice  operation.   The  effect simulates the diagonal cuts and
2621              joins the two pieces:
2622
2623                    length1   excess
2624                  -----------><--->
2625                  _________   :   :  _________________
2626                           \  :   : :\     `
2627                            \ :   : : \     `
2628                             \:   : :  \     `
2629                              *   : :   * - - *
2630                               \  : :   :\     `
2631                                \ : :   : \     `
2632                  _______________\: :   :  \_____`____
2633                                    :   :   :     :
2634                                    <--->   <----->
2635                                    excess  leeway
2636
2637              where * indicates the joining points.
2638
2639              For example, a long song begins with two verses which start  (as
2640              determined  e.g. by using the play command with the trim (start)
2641              effect) at times 0:30.125 and 1:03.432.  The following  commands
2642              cut out the first verse:
2643                 sox too-long.wav part1.wav trim 0 30.130
2644              (5 ms excess, after the first verse starts)
2645                 sox too-long.wav part2.wav trim 1:03.422
2646              (5 ms excess plus 5 ms leeway, before the second verse starts)
2647                 sox part1.wav part2.wav just-right.wav splice 30.130
2648              For another example, the SoX command
2649                 play "|sox -n -p synth 1 sin %1" "|sox -n -p synth 1 sin %3"
2650              generates and plays two notes, but there is a nasty click at the
2651              transition; the click can be removed by splicing instead of con‐
2652              catenating the audio, i.e. by appending splice 1 to the command.
2653              (Clicks at the beginning and end of the audio can be removed  by
2654              preceding the splice effect with fade q .01 2 .01).
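
              The trim points in the verse-cutting example above follow
              directly from the excess and leeway values.  The sketch below
              reproduces that arithmetic; cut_verse_cmds is a hypothetical
              helper that only prints the commands (with times in plain
              seconds), using the file names from the example.

```sh
#!/bin/sh
# cut_verse_cmds START1 START2 [EXCESS [LEEWAY]]
# Print the sox commands that remove the audio between START1 and START2
# (both in seconds), using the default excess and leeway of 0.005 s.
cut_verse_cmds() {
    awk -v a="$1" -v b="$2" -v e="${3:-0.005}" -v l="${4:-0.005}" 'BEGIN {
        end1 = a + e        # first part: keep an excess after the join point
        beg2 = b - e - l    # second part: start early by excess plus leeway
        printf "sox too-long.wav part1.wav trim 0 %.3f\n", end1
        printf "sox too-long.wav part2.wav trim %.3f\n", beg2
        printf "sox part1.wav part2.wav just-right.wav splice %.3f\n", end1
    }'
}

cut_verse_cmds 30.125 63.432
```

              The second trim point is printed as 63.422, which is the same
              instant as 1:03.422.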
2655
2656              Provided your arithmetic is good enough, multiple splices can be
2657              performed with a single splice invocation.  For example:
2658              #!/bin/sh
2659              # Audio Copy and Paste Over
2660              # acpo infile copy-start copy-stop paste-over-start outfile
2661              # No chained time specifications allowed for the parameters
2662              # (i.e. such that contain +/-).
2663              e=0.005                      # Using default excess
2664              l=$e                         # and leeway.
2665              sox "$1" piece.wav trim $2-$e-$l =$3+$e
2666              sox "$1" part1.wav trim 0 $4+$e
2667              sox "$1" part2.wav trim $4+$3-$2-$e-$l
2668              sox part1.wav piece.wav part2.wav "$5" \
2669                 splice $4+$e +$3-$2+$e+$l+$e
2670              In the above Bourne shell script, two splices are used to  `copy
2671              and paste' audio.
2672
2673                                    *        *        *
2674
2675              It is also possible to use this effect to perform general cross-
2676              fades, e.g. to join two songs.  In this case, excess would
2677              typically be a number of seconds, the -q option would typically be
2678              given (to select an `equal power' cross-fade), and leeway should
2679              be  zero (which is the default if -q is given).  For example, if
2680              f1.wav and f2.wav are audio files to be cross-faded, then
2681                 sox f1.wav f2.wav out.wav splice -q $(soxi -D f1.wav),3
2682              cross-fades the files where the point of  equal  loudness  is  3
2683              seconds  before  the end of f1.wav, i.e. the total length of the
2684              cross-fade is 2 × 3 = 6 seconds (Note: the  $(...)  notation  is
2685              POSIX shell).
2686
2687       stat [-s scale] [-rms] [-freq] [-v] [-d]
2688              Display  time and frequency domain statistical information about
2689              the audio.  Audio is passed unmodified through the SoX  process‐
2690              ing chain.
2691
2692              The  information  is  output  to  the  `standard error' (stderr)
2693              stream and is calculated, where n is the duration of  the  audio
2694              in  samples,  c  is the number of audio channels, r is the audio
2695              sample rate, and xk represents the PCM value (in the range -1 to
2696              +1  by  default) of each successive sample in the audio, as fol‐
2697              lows:
2698
2699               Samples read        n×c
2700               Length (seconds)    n÷r
2701               Scaled by                                 See -s below.
2702               Maximum amplitude   max(xk)               The maximum  sample
2703                                                         value in the audio;
2704                                                         usually  this  will
2705                                                         be  a positive num‐
2706                                                         ber.
2707               Minimum amplitude   min(xk)               The minimum  sample
2708                                                         value in the audio;
2709                                                         usually  this  will
2710                                                         be  a negative num‐
2711                                                         ber.
2712               Midline amplitude   ½min(xk)+½max(xk)
2713               Mean norm           ¹/nΣ│xk│              The average of  the
2714                                                         absolute  value  of
2715                                                         each sample in  the
2716                                                         audio.
2717               Mean amplitude      ¹/nΣxk                The average of each
2718                                                         sample    in    the
2719                                                         audio.    If   this
2720                                                         figure is non-zero,
2721                                                         then  it  indicates
2722                                                         the presence  of  a
2723                                                         D.C.  offset (which
2724                                                         could  be   removed
2725                                                         using  the  dcshift
2726                                                         effect).
2730               RMS amplitude       √(¹/nΣxk²)            The level of a D.C.
2731                                                         signal  that  would
2732                                                         have the same power
2733                                                         as    the   audio's
2734                                                         average power.
2735               Maximum delta       max(│xk-xk-1│)
2736               Minimum delta       min(│xk-xk-1│)
2737               Mean delta          ¹/n-1Σ│xk-xk-1│
2738               RMS delta           √(¹/n-1Σ(xk-xk-1)²)
2739               Rough frequency                           In Hz.
2740               Volume Adjustment                         The  parameter   to
2741                                                         the    vol   effect
2742                                                         which  would   make
2743                                                         the  audio  as loud
2744                                                         as possible without
2745                                                         clipping.     Note:
2746                                                         See the  discussion
2747                                                         on  Clipping  above
2748                                                         for reasons why  it
2749                                                         is  rarely  a  good
2750                                                         idea actually to do
2751                                                         this.
2752
2753              Note  that  the delta measurements are not applicable for multi-
2754              channel audio.
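
                  As a minimal sketch of the formulae in the table above
                  (not SoX's own code), the following Python computes them
                  for one channel of samples.  The name stat_figures is
                  hypothetical, and the `volume adjustment' entry assumes
                  that the figure is the reciprocal of the largest absolute
                  sample value.

```python
import math


def stat_figures(x):
    """Compute the tabulated figures for one channel (needs >= 2 samples)."""
    n = len(x)
    # |x[k] - x[k-1]| for every pair of neighbouring samples
    deltas = [abs(b - a) for a, b in zip(x, x[1:])]
    peak = max(abs(max(x)), abs(min(x)))
    return {
        "maximum amplitude": max(x),
        "minimum amplitude": min(x),
        "midline amplitude": 0.5 * min(x) + 0.5 * max(x),
        "mean norm": sum(abs(v) for v in x) / n,
        "mean amplitude": sum(x) / n,
        "rms amplitude": math.sqrt(sum(v * v for v in x) / n),
        "maximum delta": max(deltas),
        "minimum delta": min(deltas),
        "mean delta": sum(deltas) / (n - 1),
        "rms delta": math.sqrt(sum(d * d for d in deltas) / (n - 1)),
        # Assumption: largest amplitude ratio that keeps the peak at 1.0
        "volume adjustment": 1.0 / peak,
    }


figures = stat_figures([0.5, -0.25, 0.25, -0.5])
```

                  For this four-sample input the mean amplitude is 0 (no
                  D.C. offset), the mean norm is 0.375, and the volume
                  adjustment is 2.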
2755
2756              The -s option can be used to scale the input  data  by  a  given
2757              factor.  The default value of scale is 2147483647 (i.e. the max‐
2758              imum value of a 32-bit signed integer).  Internal effects always
2759              work with signed long PCM data and so the value should relate to
2760              this fact.
2761
2762              The -rms option will convert all output average values to  `root
2763              mean square' format.
2764
2765              The -v option displays only the `Volume Adjustment' value.
2766
2767              The  -freq  option  calculates  the input's power spectrum (4096
2768              point DFT) instead of the statistics listed above.  This  should
2769              only be used with a single channel audio file.
2770
2771              The  -d option displays a hex dump of the 32-bit signed PCM data
2772              audio in SoX's internal buffer.  This is  mainly  used  to  help
2773              track  down  endian problems that sometimes occur in cross-plat‐
2774              form versions of SoX.
2775
2776              See also the stats effect.
2777
2778       stats [-b bits|-x bits|-s scale] [-w window-time]
2779              Display time domain  statistical  information  about  the  audio
2780              channels;  audio is passed unmodified through the SoX processing
2781              chain.  Statistics are calculated and displayed for  each  audio
2782              channel and, where applicable, an overall figure is also given.
2783
2784              For example, for a typical well-mastered stereo music file:
2785
2786                                       Overall     Left      Right
2787                          DC offset   0.000803 -0.000391  0.000803
2788                          Min level  -0.750977 -0.750977 -0.653412
2789                          Max level   0.708801  0.708801  0.653534
2790                          Pk lev dB      -2.49     -2.49     -3.69
2791                          RMS lev dB    -19.41    -19.13    -19.71
2792                          RMS Pk dB     -13.82    -13.82    -14.38
2793                          RMS Tr dB     -85.25    -85.25    -82.66
2794                          Crest factor       -      6.79      6.32
2795                          Flat factor     0.00      0.00      0.00
2796                          Pk count           2         2         2
2797                          Bit-depth      16/16     16/16     16/16
2798                          Num samples    7.72M
2799                          Length s     174.973
2800                          Scale max   1.000000
2801                          Window s       0.050
2802
2803              DC offset,  Min level,  and  Max level are shown, by default, in
2804              the range ±1.  If the -b (bits) option  is  given,  then  these
2805              three  measurements  will be scaled to a signed integer with the
2806              given number of bits; for example, for 16 bits, the scale  would
2807              be  -32768  to +32767.  The -x option behaves the same way as -b
2808              except that the signed integer values are displayed in hexadeci‐
2809              mal.   The  -s  option  scales the three measurements by a given
2810              floating-point number.
2811
2812              Pk lev dB and RMS lev dB are standard peak and  RMS  level  mea‐
2813              sured in dBFS.  RMS Pk dB and RMS Tr dB are peak and trough val‐
2814              ues for RMS level measured over a short window (default 50ms).
2815
2816              Crest factor is the standard ratio of peak to RMS  level  (note:
2817              not in dB).
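
                  The relationship between these figures can be checked
                  with a short Python sketch.  The function names are
                  hypothetical; the formulae assume the usual dBFS
                  definition (20 log10 of an amplitude on the ±1 scale) and
                  take the peak as the larger magnitude of Min level and
                  Max level.

```python
import math


def level_db(amplitude):
    """Level in dBFS of an amplitude expressed on the +/-1 scale."""
    return 20.0 * math.log10(abs(amplitude))


def peak_level_db(min_level, max_level):
    """Peak level in dBFS from the Min level and Max level figures."""
    return level_db(max(abs(min_level), abs(max_level)))


def crest_factor(min_level, max_level, rms_level_db):
    """Ratio of peak to RMS level (a plain ratio, not dB)."""
    peak = max(abs(min_level), abs(max_level))
    rms = 10.0 ** (rms_level_db / 20.0)
    return peak / rms


# Left-channel figures from the example table above
left_pk = peak_level_db(-0.750977, 0.708801)
left_crest = crest_factor(-0.750977, 0.708801, -19.13)
```

                  With the left-channel figures from the example, this
                  gives a peak level of about -2.49 dB and a crest factor
                  of about 6.79, in agreement with the table.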
2818
2819              Flat factor  is a measure of the flatness (i.e. consecutive sam‐
2820              ples with the same value) of the signal at its peak levels (i.e.
2821              either  Min level,  or  Max level).   Pk count  is the number of
2822              occasions (not the number of samples) that the  signal  attained
2823              either Min level, or Max level.
2824
2825              The  right-hand  Bit-depth  figure is the standard definition of
2826              bit-depth i.e. bits less significant than the given  number  are
2827              fixed  at zero.  The left-hand figure is the number of most sig‐
2828              nificant bits that are fixed at zero (or one for  negative  num‐
2829              bers)  subtracted  from  the  right-hand figure (the number sub‐
2830              tracted is directly related to Pk lev dB).
2831
2832              For multi-channel audio, an overall figure for each of the above
2833              measurements  is  given  and derived from the channel figures as
2834              follows: DC offset:  maximum  magnitude;  Max level,  Pk lev dB,
2835              RMS Pk dB,  Bit-depth:  maximum;  Min level, RMS Tr dB: minimum;
2836              RMS lev dB, Flat factor, Pk count:  average;  Crest factor:  not
2837              applicable.
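
                  These rules can be sketched in Python as follows, using
                  the Left and Right columns of the example table as input.
                  The names RULES and overall_figures are hypothetical, and
                  only the figures whose rule can be checked directly
                  against the table are included.

```python
# How each overall figure is derived from the per-channel figures
RULES = {
    "DC offset": lambda values: max(values, key=abs),  # maximum magnitude
    "Min level": min,
    "Max level": max,
    "Pk lev dB": max,
    "RMS Pk dB": max,
    "RMS Tr dB": min,
    "Pk count": lambda values: sum(values) / len(values),  # average
}


def overall_figures(channels):
    """Combine a list of per-channel figure dicts into overall figures."""
    return {name: rule([ch[name] for ch in channels])
            for name, rule in RULES.items()}


left = {"DC offset": -0.000391, "Min level": -0.750977,
        "Max level": 0.708801, "Pk lev dB": -2.49, "RMS Pk dB": -13.82,
        "RMS Tr dB": -85.25, "Pk count": 2}
right = {"DC offset": 0.000803, "Min level": -0.653412,
         "Max level": 0.653534, "Pk lev dB": -3.69, "RMS Pk dB": -14.38,
         "RMS Tr dB": -82.66, "Pk count": 2}

overall = overall_figures([left, right])
```

                  RMS lev dB is left out of the sketch: the plain mean of
                  the two rounded channel figures is -19.42, not the -19.41
                  shown in the table, so the average is presumably taken at
                  higher precision or in a different domain.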
2838
2839              Length s  is  the duration in seconds of the audio, and Num sam‐
2840              ples  is  equal  to  the  sample-rate  multiplied  by  Length s.
2841              Scale max  is  the  scaling  applied to the first three measure‐
2842              ments; specifically, it is the maximum value that could apply to
2843              Max level.   Window s  is  the length of the window used for the
2844              peak and trough RMS measurements.
2845
2846              See also the stat effect.
2847
2848       swap   Swap stereo channels.  If the input  is  not  stereo,  pairs  of
2849              channels  are  swapped,  and  a possible odd last channel passed
2850              through.  E.g., for seven channels, the output order will be  2,
2851              1, 4, 3, 6, 5, 7.
2852
2853              See  also  remix  for  an  effect  that allows arbitrary channel
2854              selection and ordering (and mixing).
2855
2856       stretch factor [window fade shift fading]
2857              Change the audio duration (but not its pitch).  This  effect  is
2858              broadly  equivalent  to  the  tempo effect with (factor inverted
2859              and) search set to zero, so in general, its results are compara‐
2860              tively  poor;  it  is  retained  as it can sometimes out-perform
2861              tempo for small factors.
2862
2863              factor is the stretch ratio: >1 lengthens, <1 shortens the
2864              duration.  window is the window size in ms (default 20).  fade
2865              can be `lin'.  shift is a ratio in [0, 1]; its default depends
2866              on factor: 1 to shorten, 0.8 to lengthen.  fading is a ratio
2867              in [0, 0.5]; its default depends on factor and shift.
2868
2869              See also the tempo effect.
2870
2871       synth [-j KEY] [-n] [len [off [ph [p1 [p2 [p3]]]]]] {[type] [combine]
2872       [[%]freq[k][:|+|/|-[%]freq2[k]]] [off [ph [p1 [p2 [p3]]]]]}
2873              This  effect  can  be  used to generate fixed or swept frequency
2874              audio tones with various wave shapes, or to  generate  wide-band
2875              noise  of various `colours'.  Multiple synth effects can be cas‐
2876              caded to produce more complex waveforms; at  each  stage  it  is
2877              possible  to choose whether the generated waveform will be mixed
2878              with, or modulated onto the  output  from  the  previous  stage.
2879              Audio for each channel in a multi-channel audio file can be syn‐
2880              thesised independently.
2881
2882              Though this effect is used to generate audio, an input file must
2883              still be given, the characteristics of which will be used to set
2884              the synthesised audio length, the number of  channels,  and  the
2885              sampling rate; however, since the input file's audio is not nor‐
2886              mally needed, a `null file' (with the special name -n) is  often
2887              given  instead (and the length specified as a parameter to synth
2888              or by another given effect that has an associated length).
2889
2890              For example, the following produces a  3  second,  48kHz,  audio
2891              file containing a sine-wave swept from 300 to 3300 Hz:
2892                 sox -n output.wav synth 3 sine 300-3300
2893              and this produces an 8 kHz version:
2894                 sox -r 8000 -n output.wav synth 3 sine 300-3300
2895              Multiple  channels  can  be synthesised by specifying the set of
2896              parameters shown between braces multiple  times;  the  following
2897              puts  the  swept tone in the left channel and adds `brown' noise
2898              in the right:
2899                 sox -n output.wav synth 3 sine 300-3300 brownnoise
2900              The following example shows how two synth effects  can  be  cas‐
2901              caded to create a more complex waveform:
2902                 play -n synth 0.5 sine 200-500 synth 0.5 sine fmod 700-100
2903              Frequencies can also be given in `scientific' note notation, or,
2904              by prefixing a `%' character, as a number of semitones  relative
2905              to  `middle  A'  (440 Hz).   For example, the following could be
2906              used to help tune a guitar's low `E' string:
2907                 play -n synth 4 pluck %-29
2908              or with a (Bourne shell) loop, the whole guitar:
2909                 for n in E2 A2 D3 G3 B3 E4; do
2910                   play -n synth 4 pluck $n repeat 2; done
2911              See the delay effect (above) and the reference to `SoX scripting
2912              examples' (below) for more synth examples.
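
                  The arithmetic behind the `%' and note notations can be
                  sketched in Python.  This assumes equal temperament with
                  A4 = 440 Hz as `middle A' and octave numbers that change
                  at C; the function names are hypothetical and the note
                  parser is an illustration only, which may accept a
                  different syntax from SoX itself.

```python
SEMITONES_FROM_C = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def semitones_to_hz(semitones, reference_hz=440.0):
    """Equal-temperament frequency a number of semitones from 440 Hz."""
    return reference_hz * 2.0 ** (semitones / 12.0)


def note_to_semitones(note):
    """Semitones relative to A4 for a note such as 'E2' or 'F#3'."""
    letter, rest = note[0].upper(), note[1:]
    accidental = 0
    if rest and rest[0] in "#b":
        accidental = 1 if rest[0] == "#" else -1
        rest = rest[1:]
    octave = int(rest)
    return SEMITONES_FROM_C[letter] - 9 + accidental + 12 * (octave - 4)


# The six guitar strings used in the loop above
guitar = {n: semitones_to_hz(note_to_semitones(n))
          for n in ["E2", "A2", "D3", "G3", "B3", "E4"]}
```

                  Under these assumptions E2 lies 29 semitones below A4,
                  which is why %-29 and E2 both name the low `E' string, at
                  about 82.4 Hz.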
2913
2914              N.B.   This  effect  generates  audio at maximum volume (0dBFS),
2915              which means that there is a high chance of clipping  when  using
2916              the  audio subsequently, so in many cases, you will want to fol‐
2917              low this effect with the gain effect to prevent this  from  hap‐
2918              pening.  (See  also Clipping above.)  Note that, by default, the
2919              synth effect incorporates the functionality of gain -h (see  the
2920              gain effect for details); synth's -n option may be given to dis‐
2921              able this behaviour.
2922
2923              A detailed description of each synth parameter follows:
2924
2925              len is the length of audio to synthesise  (any  time  specifica‐
2926              tion);  a  value of 0 means that the input length is used, which
2927              is also the default.
2928
2929              type is one of sine, square, triangle, sawtooth, trapezium, exp,
2930              [white]noise,    tpdfnoise,    pinknoise,   brownnoise,   pluck;
2931              default=sine.
2932
2933              combine is one of create, mix, amod (amplitude modulation), fmod
2934              (frequency modulation); default=create.
2935
2936              freq/freq2 are the frequencies at the beginning/end of synthesis
2937              in Hz  or,  if  preceded  with  `%',  semitones  relative  to  A
2938              (440 Hz);  alternatively,  `scientific'  note notation (e.g. E2)
2939              may be used.  The default frequency is 440Hz.  By  default,  the
2940              tuning  used with the note notations is `equal temperament'; the
2941              -j KEY option selects `just intonation', where KEY is an integer
2942              number  of  semitones  relative  to  A  (so for example, -9 or 3
2943              selects the key of C), or a note in scientific notation.
2944
2945              If freq2 is given, then len must also have been  given  and  the
2946              generated tone will be swept between the given frequencies.  The
2947              two given frequencies must be separated by one of the characters
2948              `:',  `+',  `/',  or `-'.  This character is used to specify the
2949              sweep function as follows:
2950
2951              :      Linear: the tone will change by a fixed number  of  hertz
2952                     per second.
2953
2954              +      Square:  a  second-order  function  is used to change the
2955                     tone.
2956
2957              /      Exponential: the tone will change by a  fixed  number  of
2958                     semitones per second.
2959
2960              -      Exponential:  as  `/', but initial phase always zero, and
2961                     stepped (less smooth) frequency changes.
2962
2963              Not used for noise.
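
                  The difference between the `:' and `/' sweeps can be
                  pictured with a sketch of the frequency reached at time t.
                  The function names are hypothetical and the formulae are
                  inferred from the descriptions above (a fixed number of
                  hertz, or of semitones, per second); they are not taken
                  from SoX's source.

```python
def linear_sweep_hz(f1, f2, length, t):
    """Frequency at time t when changing by a fixed number of Hz per second."""
    return f1 + (f2 - f1) * t / length


def exponential_sweep_hz(f1, f2, length, t):
    """Frequency at time t when changing by a fixed number of semitones
    per second, i.e. by a constant frequency ratio per second."""
    return f1 * (f2 / f1) ** (t / length)


# Half way through a 3 second sweep from 300 Hz to 3300 Hz
mid_linear = linear_sweep_hz(300.0, 3300.0, 3.0, 1.5)
mid_exponential = exponential_sweep_hz(300.0, 3300.0, 3.0, 1.5)
```

                  Half way through the 300 to 3300 Hz sweep, the linear
                  form has reached 1800 Hz, whereas the exponential form
                  has reached only about 995 Hz (the geometric mean of the
                  two end frequencies).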
2964
2965              off is the bias (DC-offset) of the signal in percent; default=0.
2966
2967              ph is the phase shift in percentage of 1 cycle; default=0.   Not
2968              used for noise.
2969
2970              p1  is  the  percentage  of each cycle that is `on' (square), or
2971              `rising' (triangle, exp, trapezium); default=50 (square,  trian‐
2972              gle,   exp),   default=10   (trapezium),   or  sustain  (pluck);
2973              default=40.
2974
2975              p2 (trapezium): the  percentage  through  each  cycle  at  which
2976              `falling' begins; default=50. exp: the amplitude in multiples of
2977              2dB; default=50, or tone-1 (pluck); default=20.
2978
2979              p3 (trapezium): the  percentage  through  each  cycle  at  which
2980              `falling' ends; default=60, or tone-2 (pluck); default=90.
2981
2982       tempo [-q] [-m|-s|-l] factor [segment [search [overlap]]]
2983              Change  the  audio playback speed but not its pitch. This effect
2984              uses the WSOLA algorithm. The audio is chopped up into  segments
2985              which are then shifted in the time domain and overlapped (cross-
2986              faded) at points where  their  waveforms  are  most  similar  as
2987              determined by measurement of `least squares'.
2988
2989              By  default,  linear searches are used to find the best overlap‐
2990              ping points.  If  the  optional  -q  parameter  is  given,  tree
2991              searches  are  used  instead.  This  makes  the effect work more
2992              quickly, but the result may not sound as good. However,  if  you
2993              must  improve  the  processing speed, this generally reduces the
2994              sound quality less than reducing the search or overlap values.
2995
2996              The -m option is used to optimize  default  values  of  segment,
2997              search and overlap for music processing.
2998
2999              The  -s  option  is  used to optimize default values of segment,
3000              search and overlap for speech processing.
3001
3002              The -l option is used to optimize  default  values  of  segment,
3003              search  and  overlap for `linear' processing that tends to cause
3004              more noticeable distortion but may  be  useful  when  factor  is
3005              close to 1.
3006
3007              If -m, -s, or -l is specified, the default value of segment will
3008              be calculated based on factor, while default search and  overlap
3009              values  are based on segment. Any values you provide still over‐
3010              ride these default values.
3011
3012              factor gives the ratio of new tempo to the old  tempo,  so  e.g.
3013              1.1 speeds up the tempo by 10%, and 0.9 slows it down by 10%.
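
                  As a rough guide to the effect of factor on duration, a
                  Python sketch (hypothetical function names; the stretch
                  relationship assumes that its factor is simply the
                  inverse of the tempo factor, as the stretch description
                  suggests):

```python
def tempo_duration(duration, factor):
    """Approximate output duration after `tempo factor'."""
    return duration / factor


def stretch_duration(duration, factor):
    """Approximate output duration after `stretch factor' (assumed inverse)."""
    return duration * factor


faster = tempo_duration(110.0, 1.1)   # tempo raised by 10%
longer = stretch_duration(10.0, 2.0)  # duration doubled
```

                  So 110 seconds of audio played at a tempo factor of 1.1
                  lasts about 100 seconds.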
3014
3015              The  optional  segment parameter selects the algorithm's segment
3016              size in milliseconds.  If no  other  flags  are  specified,  the
3017              default  value  is  82  and  is typically suited to making small
3018              changes to the tempo of music. For larger changes (e.g. a factor
3019              of 2), 41 ms may give a better result.  The -m, -s, and -l flags
3020              will cause the segment  default  to  be  automatically  adjusted
3021              based on factor.  For example using -s (for speech) with a tempo
3022              of 1.25 will calculate a default segment value of 32.
3023
3024              The optional search parameter gives the  audio  length  in  mil‐
3025              liseconds  over  which the algorithm will search for overlapping
3026              points.  If no other flags are specified, the default  value  is
3027              14.68.   Larger  values  use more processing time and may or may
3028              not produce better results.  A practical  maximum  is  half  the
3029              value  of  segment. Search can be reduced to cut processing time
3030              at the risk of degrading output quality.  The  -m,  -s,  and  -l
3031              flags will cause the search default to be automatically adjusted
3032              based on segment.
3033
3034              The optional overlap parameter gives the segment overlap  length
3035              in  milliseconds.   Default value is 12, but -m, -s, or -l flags
3036              automatically adjust overlap based on segment  size.  Increasing
3037              overlap  increases  processing  time and may increase quality. A
3038              practical maximum for overlap is the value of search, with over‐
3039              lap typically being (at least) a little smaller than search.
3040
3041              See  also  speed  for  an  effect  that  changes tempo and pitch
3042              together, pitch and bend for effects that change pitch only, and
3043              stretch for an effect that changes tempo using a different algo‐
3044              rithm.
3045
3046       treble gain [frequency[k] [width[s|h|k|o|q]]]
3047              Apply a treble tone-control effect.  See the description of  the
3048              bass effect for details.
3049
3050       tremolo speed [depth]
3051              Apply  a  tremolo (low frequency amplitude modulation) effect to
3052              the audio.  The tremolo frequency in Hz is given by  speed,  and
3053              the depth as a percentage by depth (default 40).
3054
3055       trim {position(+)}
3056              Cuts  portions out of the audio.  Any number of positions may be
3057              given; audio is not sent to the output until the first  position
3058              is reached.  The effect then alternates between copying and dis‐
3059              carding audio at each position.  Using a  value  of  0  for  the
3060              first  position  parameter  allows copying from the beginning of
3061              the audio.
3062
3063              For example,
3064                 sox infile outfile trim 0 10
3065              will copy the first ten seconds, while
3066                 play infile trim 12:34 =15:00 -2:00
3067              and
3068                 play infile trim 12:34 2:26 -2:00
3069              will both play from 12 minutes 34 seconds into the audio  up  to
3070              15  minutes into the audio (i.e. 2 minutes and 26 seconds long),
3071              then resume playing two minutes before the end of audio.
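
                  The way the two position lists above resolve to the same
                  points can be sketched in Python.  The sketch assumes, as
                  the examples suggest, that a plain position is relative
                  to the previous one, `=' marks a position measured from
                  the start, and `-' one measured back from the end.  The
                  function names are hypothetical, and only times given as
                  seconds, mm:ss or hh:mm:ss are handled.

```python
def parse_time(text):
    """Convert 'ss', 'mm:ss' or 'hh:mm:ss' to seconds."""
    seconds = 0.0
    for field in text.split(":"):
        seconds = seconds * 60.0 + float(field)
    return seconds


def resolve_trim_positions(positions, audio_length):
    """Return each trim position as seconds from the start of the audio."""
    resolved = []
    previous = 0.0
    for pos in positions:
        if pos.startswith("="):      # measured from the start
            current = parse_time(pos[1:])
        elif pos.startswith("-"):    # measured back from the end
            current = audio_length - parse_time(pos[1:])
        else:                        # relative to the previous position
            current = previous + parse_time(pos.lstrip("+"))
        resolved.append(current)
        previous = current
    return resolved


# Both examples above, for a hypothetical one-hour input file
a = resolve_trim_positions(["12:34", "=15:00", "-2:00"], 3600.0)
b = resolve_trim_positions(["12:34", "2:26", "-2:00"], 3600.0)
```

                  For a one-hour file both lists resolve to 754, 900 and
                  3480 seconds.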
3072
3073       upsample [factor]
3074              Upsample the signal by an integer  factor:  factor-1  zero-value
3075              samples  are  inserted between each pair of input samples.  As a
3076              result, the original spectrum is replicated into  the  new  fre‐
3077              quency  space (imaging) and attenuated.  This attenuation can be
3078              compensated for by adding vol factor after any further  process‐
3079              ing.   The upsample effect is typically used in combination with
3080              filtering effects.
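
                  The zero-insertion step can be sketched in Python as
                  follows.  The function name is hypothetical, and the
                  sketch follows the wording above literally by inserting
                  zeros only between pairs of input samples; whether SoX
                  also pads after the final sample is not stated here.

```python
def upsample(samples, factor):
    """Insert factor-1 zero-value samples between each pair of samples."""
    out = []
    for index, sample in enumerate(samples):
        if index:
            out.extend([0.0] * (factor - 1))
        out.append(sample)
    return out


stuffed = upsample([1.0, 2.0, 3.0], 3)
# stuffed == [1.0, 0.0, 0.0, 2.0, 0.0, 0.0, 3.0]
```

                  Most output samples are now zero, which is the source of
                  the attenuation mentioned above.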
3081
3082              For a general resampling effect  with  anti-imaging,  see  rate.
3083              See also downsample.
3084
3085       vad [options]
3086              Voice  Activity  Detector.   Attempts  to trim silence and quiet
3087              background sounds from the ends of (fairly high resolution  i.e.
3088              16-bit, 44-48kHz) recordings of speech.  The algorithm currently
3089              uses a simple cepstral power measurement to detect voice, so may
3090              be  fooled  by  other  things, especially music.  The effect can
3091              trim only from the front of the audio, so in order to trim  from
3092              the back, the reverse effect must also be used.  E.g.
3093                 play speech.wav norm vad
3094              to trim from the front,
3095                 play speech.wav norm reverse vad reverse
3096              to trim from the back, and
3097                 play speech.wav norm vad reverse vad reverse
3098              to  trim  from  both ends.  The use of the norm effect is recom‐
3099              mended, but remember that neither reverse nor norm  is  suitable
3100              for use with streamed audio.
3101
3102              Options:
3103              Default values are shown in parentheses.
3104
3105              -t num (7)
3106                     The measurement level used to trigger activity detection.
3107                     This might need to be  changed  depending  on  the  noise
3108                     level, signal level and other characteristics of the input
3109                     audio.
3110
3111              -T num (0.25)
3112                     The time constant (in seconds) used to help ignore  short
3113                     bursts of sound.
3114
3115              -s num (1)
3116                     The  amount  of  audio  (in  seconds)  to search for qui‐
3117                     eter/shorter bursts of audio  to  include  prior  to  the
3118                     detected trigger point.
3119
3120              -g num (0.25)
3121                     Allowed  gap  (in seconds) between quieter/shorter bursts
3122                     of audio to include prior to the detected trigger point.
3123
3124              -p num (0)
3125                     The amount of audio (in seconds) to preserve  before  the
3126                     trigger point and any found quieter/shorter bursts.
3127
3128              Advanced Options:
3129              These allow fine tuning of the algorithm's internal parameters.
3130
3131              -b num The  algorithm  (internally)  uses adaptive noise estima‐
3132                     tion/reduction in order to detect the start of the wanted
3133                     audio.   This  option sets the time for the initial noise
3134                     estimate.
3135
3136              -N num Time constant used by the adaptive  noise  estimator  for
3137                     when the noise level is increasing.
3138
3139              -n num Time  constant  used  by the adaptive noise estimator for
3140                     when the noise level is decreasing.
3141
3142              -r num Amount of noise reduction to use in the  detection  algo‐
3143                     rithm (e.g. 0, 0.5, ...).
3144
3145              -f num Frequency of the algorithm's processing/measurements.
3146
3147              -m num Measurement  duration;  by default, twice the measurement
3148                     period; i.e.  with overlap.
3149
3150              -M num Time constant used to smooth spectral measurements.
3151
3152              -h num `Brick-wall' frequency of high-pass filter applied at the
3153                     input to the detector algorithm.
3154
3155              -l num `Brick-wall'  frequency of low-pass filter applied at the
3156                     input to the detector algorithm.
3157
3158              -H num `Brick-wall' frequency of high-pass lifter  used  in  the
3159                     detector algorithm.
3160
3161              -L num `Brick-wall'  frequency  of  low-pass  lifter used in the
3162                     detector algorithm.
3163
3164              See also the silence effect.
3165
3166       vol gain [type [limitergain]]
3167              Apply an amplification or an attenuation to  the  audio  signal.
3168              Unlike the -v option (which is used for balancing multiple input
3169              files as they enter the SoX effects processing chain), vol is an
3170              effect  like  any  other so can be applied anywhere, and several
3171              times if necessary, during the processing chain.
3172
3173              The amount to change the volume is given by gain which is inter‐
3174              preted,  according  to  the  given  type, as follows: if type is
3175              amplitude (or is omitted), then gain is an amplitude (i.e. volt‐
3176              age  or  linear)  ratio, if power, then a power (i.e. wattage or
3177              voltage-squared) ratio, and if dB, then a power change in dB.
3178
3179              When type is amplitude or power, a gain of 1 leaves  the  volume
3180              unchanged,  less  than  1  decreases  it,  and  greater  than  1
3181              increases it; a negative gain inverts the audio signal in  addi‐
3182              tion to adjusting its volume.
3183
3184              When  type  is dB, a gain of 0 leaves the volume unchanged, less
3185              than 0 decreases it, and greater than 0 increases it.
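
                  The three interpretations of gain can be compared by
                  converting each to the equivalent amplitude ratio, as in
                  this Python sketch.  The function name is hypothetical;
                  the conversions are the standard ones (amplitude is the
                  square root of power, and a power change of g dB is an
                  amplitude ratio of 10^(g/20)), and carrying the sign of a
                  negative power gain through the square root is an
                  assumption.

```python
import math


def amplitude_ratio(gain, gain_type="amplitude"):
    """Amplitude multiplier equivalent to `vol gain type'."""
    if gain_type == "amplitude":
        return gain
    if gain_type == "power":
        # Assumption: a negative power gain inverts the signal
        return math.copysign(math.sqrt(abs(gain)), gain)
    if gain_type == "dB":
        return 10.0 ** (gain / 20.0)
    raise ValueError("unknown gain type: " + gain_type)


doubled = amplitude_ratio(4.0, "power")   # power x4 is amplitude x2
ten_db = amplitude_ratio(10.0, "dB")      # about 3.16
```

                  Under these conversions, vol 2, vol 4 power and (to two
                  decimal places) vol 6.02 dB all apply the same change.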
3186
3187              See [4] for a detailed discussion on electrical (and hence audio
3188              signal) voltage and power ratios.
3189
3190              Beware of Clipping when increasing the volume.
3191
3192              The gain and the type parameters can be concatenated if desired,
3193              e.g.  vol 10dB.
3194
3195              An optional limitergain value can be specified and should  be  a
3196              value  much  less than 1 (e.g. 0.05 or 0.02) and is used only on
3197              peaks to prevent clipping.  Not specifying this  parameter  will
3198              cause  no limiter to be used.  In verbose mode, this effect will
3199              display the percentage of the audio that needed to be limited.
3200
3201              See also gain for a volume-changing effect with different  capa‐
3202              bilities,  and  compand  for  a dynamic-range compression/expan‐
3203              sion/limiting effect.
3204

DIAGNOSTICS

3206       Exit status is 0 for no error, 1 if there is a problem  with  the  com‐
3207       mand-line parameters, or 2 if an error occurs during file processing.
3208

BUGS

3210       Please report any bugs found in this version of SoX to the mailing list
3211       (sox-users@lists.sourceforge.net).
3212

LICENSE

3238       Copyright 1998-2013 Chris Bagwell and SoX Contributors.
3239       Copyright 1991 Lance Norskog and Sundry Contributors.
3240
3241       This program is free software; you can redistribute it and/or modify it
3242       under the terms of the GNU General Public License as published  by  the
3243       Free  Software  Foundation;  either  version 2, or (at your option) any
3244       later version.
3245
3246       This program is distributed in the hope that it  will  be  useful,  but
3247       WITHOUT  ANY  WARRANTY;  without  even  the  implied  warranty  of MER‐
3248       CHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU  General
3249       Public License for more details.
3250

AUTHORS

3252       Chris Bagwell (cbagwell@users.sourceforge.net).  Other authors and con‐
3253       tributors are listed in the ChangeLog file that is distributed with the
3254       source code.
3255
3256
3257
3258sox                            December 31, 2014                        SoX(1)