SoX(1)                          Sound eXchange                          SoX(1)

NAME

       SoX - Sound eXchange, the Swiss Army knife of audio manipulation

SYNOPSIS

       sox [global-options] [format-options] infile1
            [[format-options] infile2] ... [format-options] outfile
            [effect [effect-options]] ...

       play [global-options] [format-options] infile1
            [[format-options] infile2] ... [format-options]
            [effect [effect-options]] ...

       rec [global-options] [format-options] outfile
            [effect [effect-options]] ...

DESCRIPTION

   Introduction
22       SoX  reads  and  writes  audio  files  in  most popular formats and can
23       optionally apply  effects  to  them.  It  can  combine  multiple  input
24       sources,  synthesise audio, and, on many systems, act as a general pur‐
25       pose audio player or a multi-track audio recorder. It also has  limited
26       ability to split the input into multiple output files.
27
28       All SoX functionality is available using just the sox command.  To sim‐
29       plify playing and recording audio, if SoX is invoked as play, the  out‐
30       put  file  is  automatically set to be the default sound device, and if
31       invoked as rec, the default sound device is used as  an  input  source.
32       Additionally,  the  soxi(1)  command  provides a convenient way to just
33       query audio file header information.
34
35       The heart of SoX is a  library  called  libSoX.   Those  interested  in
36       extending  SoX or using it in other programs should refer to the libSoX
37       manual page: libsox(3).
38
39       SoX is a command-line audio processing  tool,  particularly  suited  to
40       making  quick,  simple  edits  and to batch processing.  If you need an
41       interactive, graphical audio editor, use audacity(1).
42
43                                 *        *        *
44
45       The overall SoX processing chain can be summarised as follows:
46
47                      Input(s) → Combiner → Effects → Output(s)
48
49       Note however, that on the SoX command line, the positions of  the  Out‐
50       put(s)  and the Effects are swapped w.r.t. the logical flow just shown.
51       Note also that whilst options pertaining to  files  are  placed  before
52       their  respective file name, the opposite is true for effects.  To show
53       how this works in practice, here is a selection of examples of how  SoX
54       might be used.  The simple
55          sox recital.au recital.wav
56       translates  an  audio  file  in  Sun AU format to a Microsoft WAV file,
57       whilst
58          sox recital.au -b 16 recital.wav channels 1 rate 16k fade 3 norm
       performs the same format translation, but also applies four effects
       (down-mix to one channel, sample rate change, fade-in, normalise), and
       stores the result at a bit-depth of 16.
62          sox -r 16k -e signed -b 8 -c 1 voice-memo.raw voice-memo.wav
63       converts `raw' (a.k.a. `headerless') audio to  a  self-describing  file
64       format,
65          sox slow.aiff fixed.aiff speed 1.027
66       adjusts audio speed,
67          sox short.wav long.wav longer.wav
68       concatenates two audio files, and
69          sox -m music.mp3 voice.wav mixed.flac
70       mixes together two audio files.
71          play "The Moonbeams/Greatest/*.ogg" bass +3
72       plays  a  collection  of  audio  files  whilst applying a bass boosting
73       effect,
74          play -n -c1 synth sin %-12 sin %-9 sin %-5 sin %-2 fade h 0.1 1 0.1
75       plays a synthesised `A minor seventh' chord with a pipe-organ sound,
76          rec -c 2 radio.aiff trim 0 30:00
77       records half an hour of stereo audio, and
78          play -q take1.aiff & rec -M take1.aiff take1-dub.aiff
79       (with POSIX shell and where supported by hardware) records a new  track
80       in a multi-track recording.  Finally,
81          rec -r 44100 -b 16 -s -p silence 1 0.50 0.1% 1 10:00 0.1% | \
82            sox -p song.ogg silence 1 0.50 0.1% 1 2.0 0.1% : \
83            newfile : restart
       records a stream of audio such as LP/cassette and splits it into
       multiple audio files at points with 2 seconds of silence.  Also, it
       does not start recording until it detects audio is playing and stops
       after it sees 10 minutes of silence.
88
89       N.B.  The above is just an overview  of  SoX's  capabilities;  detailed
90       explanations  of  how  to  use  all  SoX  parameters, file formats, and
91       effects can be found below in this  manual,  in  soxformat(7),  and  in
92       soxi(1).
93
   File Format Types
95       SoX  can  work  with  `self-describing'  and `raw' audio files.  `self-
96       describing' formats (e.g. WAV, FLAC, MP3) have a header that completely
97       describes  the  signal  and  encoding attributes of the audio data that
98       follows. `raw' or `headerless' formats do not contain this information,
99       so the audio characteristics of these must be described on the SoX com‐
100       mand line or inferred from those of the input file.
101
102       The following four characteristics are used to describe the  format  of
103       audio data such that it can be processed with SoX:
104
105       sample rate
106              The  sample rate in samples per second (`Hertz' or `Hz').  Digi‐
107              tal telephony  traditionally  uses  a  sample  rate  of  8000 Hz
108              (8 kHz), though these days, 16 and even 32 kHz are becoming more
109              common. Audio Compact Discs  use  44100 Hz  (44.1 kHz).  Digital
110              Audio  Tape  and  many computer systems use 48 kHz. Professional
111              audio systems often use 96 kHz.
112
113       sample size
114              The number of bits used to store each sample.  Today, 16-bit  is
115              commonly  used.  8-bit was popular in the early days of computer
116              audio. 24-bit is used in the  professional  audio  arena.  Other
117              sizes are also used.
118
119       data encoding
120              The   way   in  which  each  audio  sample  is  represented  (or
121              `encoded').  Some encodings have variants with  different  byte-
122              orderings  or  bit-orderings.   Some  compress the audio data so
123              that the stored audio data takes up less space (i.e. disk  space
124              or  transmission bandwidth) than the other format parameters and
125              the number of samples would imply.  Commonly-used encoding types
126              include  floating-point,  μ-law, ADPCM, signed-integer PCM, MP3,
127              and FLAC.
128
129       channels
130              The number  of  audio  channels  contained  in  the  file.   One
131              (`mono')  and  two (`stereo') are widely used.  `Surround sound'
132              audio typically contains six or more channels.
133
       The term `bit-rate' is a measure of the amount of storage occupied by
       an encoded audio signal over a unit of time.  It can depend on all of
       the above and is typically denoted as a number of kilo-bits per second
       (kbps).  An A-law telephony signal has a bit-rate of 64 kbps.
       MP3-encoded stereo music typically has a bit-rate of 128-196 kbps.
       FLAC-encoded stereo music typically has a bit-rate of 550-760 kbps.
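
       For uncompressed audio, the bit-rate is simply the product of sample
       rate, sample size and number of channels.  The following is a minimal
       POSIX shell sketch of that arithmetic; the function name bitrate_kbps
       is hypothetical and is not part of SoX.

```shell
# bitrate_kbps RATE_HZ BITS_PER_SAMPLE CHANNELS
# Prints the uncompressed bit-rate in whole kbps (1 kbps = 1000 bits/s).
bitrate_kbps() {
    echo $(( $1 * $2 * $3 / 1000 ))
}

bitrate_kbps 8000 8 1      # 8 kHz, 8-bit, mono (A-law telephony): prints 64
bitrate_kbps 44100 16 2    # 44.1 kHz, 16-bit, stereo (audio CD): prints 1411
```

       Compressing encodings such as MP3 and FLAC yield lower figures than
       this product, which is why their typical bit-rates are quoted
       separately above.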
140
141       Most self-describing formats also allow textual `comments' to be embed‐
142       ded in the file that can be used to describe the  audio  in  some  way,
143       e.g. for music, the title, the author, etc.
144
145       One  important  use  of  audio file comments is to convey `Replay Gain'
146       information.  SoX supports applying Replay Gain  information,  but  not
147       generating it.  Note that by default, SoX copies input file comments to
148       output files that support comments, so output files may contain  Replay
149       Gain  information if some was present in the input file.  In this case,
150       if anything other than a simple format conversion  was  performed  then
151       the  output  file Replay Gain information is likely to be incorrect and
152       so should be recalculated using a tool that supports this (not SoX).
153
154       The soxi(1) command can be used to display information from audio  file
155       headers.
156
   Determining & Setting The File Format
158       There  are  several mechanisms available for SoX to use to determine or
159       set the format characteristics of an audio file.  Depending on the cir‐
160       cumstances,  individual  characteristics may be determined or set using
161       different mechanisms.
162
163       To determine the format of an input file, SoX will  use,  in  order  of
164       precedence and as given or available:
165
166       1.  Command-line format options.
167
168       2.  The contents of the file header.
169
170       3.  The filename extension.
171
172       To set the output file format, SoX will use, in order of precedence and
173       as given or available:
174
175       1.  Command-line format options.
176
177       2.  The filename extension.
178
179       3.  The input file format characteristics, or the closest that is  sup‐
180           ported by the output file type.
181
182       For  all  files, SoX will exit with an error if the file type cannot be
183       determined. Command-line format options may need to be added or changed
184       to resolve the problem.
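
       The order of precedence for input files can be modelled as below.
       This is an illustrative shell sketch only, not SoX's implementation;
       the function pick_input_type and its three arguments are hypothetical,
       and an empty argument stands for `not given' or `not available'.

```shell
# pick_input_type OPTION_TYPE HEADER_TYPE FILENAME
# Prints the file type selected by the input-file precedence rules.
pick_input_type() {
    if [ -n "$1" ]; then
        echo "$1"                     # 1. command-line format option
    elif [ -n "$2" ]; then
        echo "$2"                     # 2. contents of the file header
    else
        case "$3" in
            *.*) echo "${3##*.}" ;;   # 3. filename extension
            *)   echo "cannot determine file type" >&2
                 return 1 ;;          # nothing left to go on: error
        esac
    fi
}

pick_input_type raw wav memo.au    # prints raw
pick_input_type ""  wav memo.au    # prints wav
pick_input_type ""  ""  memo.au    # prints au
```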
185
   Playing & Recording Audio
187       The  play  and  rec  commands  are  provided  so that basic playing and
188       recording is as simple as
189          play existing-file.wav
190       and
191          rec new-file.wav
192       These two commands are functionally equivalent to
193          sox existing-file.wav -d
194       and
195          sox -d new-file.wav
196       Of course, further options and effects  (as  described  below)  can  be
197       added to the commands in either form.
198
199                                 *        *        *
200
201       Some  systems  provide  more  than  one  type of (SoX-compatible) audio
202       driver, e.g. ALSA & OSS, or SUNAU & AO.  Systems  can  also  have  more
203       than  one  audio  device (a.k.a. `sound card').  If more than one audio
204       driver has been built-in to SoX, and the default selected by  SoX  when
205       recording  or  playing  is  not the one that is wanted, then the AUDIO‐
206       DRIVER environment variable can be used to override the  default.   For
207       example (on many systems):
208          set AUDIODRIVER=oss
209          play ...
210       The  AUDIODEV  environment variable can be used to override the default
211       audio device, e.g.
212          set AUDIODEV=/dev/dsp2
213          play ...
214          sox ... -t oss
215       or
216          set AUDIODEV=hw:soundwave,1,2
217          play ...
218          sox ... -t alsa
219       Note that the way of setting environment variables varies  from  system
220       to system - for some specific examples, see `SOX_OPTS' below.
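
       In a POSIX shell, for instance, a variable can be set for a single
       command or exported for the rest of the session.  The sketch below
       uses sh -c with echo as a stand-in for play so that it runs without an
       audio device; the driver names are those used in the examples above.

```shell
# One-off override: the variable is placed in the environment of this
# single command only.
AUDIODRIVER=oss sh -c 'echo "driver for this command: $AUDIODRIVER"'

# Session-wide override: an exported variable is inherited by every
# command started afterwards from this shell.
export AUDIODRIVER=alsa
sh -c 'echo "driver for later commands: $AUDIODRIVER"'
```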
221
222       When  playing  a  file  with a sample rate that is not supported by the
223       audio output device, SoX will automatically invoke the rate  effect  to
224       perform  the  necessary sample rate conversion.  For compatibility with
225       old hardware, the default rate quality level is set to `low'. This  can
226       be  changed  by  explicitly specifying the rate effect with a different
227       quality level, e.g.
228          play ... rate -m
229       or by using the --play-rate-arg option (see below).
230
231                                 *        *        *
232
233       On some systems, SoX allows audio playback volume to be adjusted whilst
234       using play.  Where supported, this is achieved by tapping the `v' & `V'
235       keys during playback.
236
237       To help with setting a suitable recording level, SoX includes  a  peak-
238       level  meter  which can be invoked (before making the actual recording)
239       as follows:
240          rec -n
241       The recording level should be adjusted (using the system-provided mixer
242       program, not SoX) so that the meter is at most occasionally full scale,
243       and never `in the red' (an exclamation mark is  shown).   See  also  -S
244       below.
245
   Accuracy
247       Many  file formats that compress audio discard some of the audio signal
248       information whilst doing so. Converting to such a format and then  con‐
249       verting  back  again  will  not  produce  an exact copy of the original
250       audio.  This is the case for many formats used in telephony  (e.g.   A-
251       law,  GSM) where low signal bandwidth is more important than high audio
252       fidelity, and for many formats used in  portable  music  players  (e.g.
253       MP3,  Vorbis)  where  adequate  fidelity  can be retained even with the
254       large compression ratios that are needed to make portable players prac‐
255       tical.
256
257       Formats that discard audio signal information are called `lossy'.  For‐
258       mats that do not are called `lossless'.  The term `quality' is used  as
259       a  measure  of  how closely the original audio signal can be reproduced
260       when using a lossy format.
261
262       Audio file conversion with SoX is lossless when it can  be,  i.e.  when
263       not  using  lossy  compression,  when not reducing the sampling rate or
264       number of channels, and when the number of bits used in the destination
265       format is not less than in the source format.  E.g.  converting from an
266       8-bit PCM format to a 16-bit PCM format is lossless but converting from
267       an 8-bit PCM format to (8-bit) A-law isn't.
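
       For uncompressed PCM at both ends, the three conditions above amount
       to a simple comparison.  The following shell sketch encodes them; the
       function name pcm_conversion_is_lossless is hypothetical, and lossy
       encodings such as A-law are outside its scope.

```shell
# pcm_conversion_is_lossless SRC_RATE SRC_CHANS SRC_BITS DST_RATE DST_CHANS DST_BITS
# Succeeds (exit status 0) when neither the sampling rate, the number of
# channels, nor the number of bits is reduced by the conversion.
pcm_conversion_is_lossless() {
    [ "$4" -ge "$1" ] && [ "$5" -ge "$2" ] && [ "$6" -ge "$3" ]
}

if pcm_conversion_is_lossless 8000 1 8 8000 1 16; then
    echo "8-bit to 16-bit PCM: lossless"
fi
if ! pcm_conversion_is_lossless 44100 2 16 22050 2 16; then
    echo "44.1 kHz to 22.05 kHz: not lossless"
fi
```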
268
269       N.B.   SoX  converts all audio files to an internal uncompressed format
270       before performing any audio processing. This means that manipulating  a
271       file that is stored in a lossy format can cause further losses in audio
272       fidelity.  E.g. with
273          sox long.mp3 short.mp3 trim 10
274       SoX first decompresses the  input  MP3  file,  then  applies  the  trim
275       effect,  and  finally creates the output MP3 file by re-compressing the
276       audio - with a possible reduction in fidelity above that which occurred
277       when  the input file was created.  Hence, if what is ultimately desired
278       is lossily compressed audio, it is highly recommended  to  perform  all
279       audio  processing  using  lossless file formats and then convert to the
280       lossy format only at the final stage.
281
282       N.B.  Applying multiple effects with a single SoX invocation  will,  in
283       general, produce more accurate results than those produced using multi‐
284       ple SoX invocations.
285
   Dithering
287       Dithering is a technique used to maximise the dynamic  range  of  audio
288       stored  at a particular bit-depth. Any distortion introduced by quanti‐
289       sation is decorrelated by adding a small amount of white noise  to  the
290       signal.  In most cases, SoX can determine whether the selected process‐
291       ing requires dither and will add it during output formatting if  appro‐
292       priate.
293
294       Specifically,  by  default, SoX automatically adds TPDF dither when the
295       output bit-depth is less than 24 and any of the following are true:
296
297       ·   bit-depth reduction has been specified explicitly using a  command-
298           line option
299
300       ·   the  output file format supports only bit-depths lower than that of
301           the input file format
302
303       ·   an effect has increased effective  bit-depth  within  the  internal
304           processing chain
305
306       For  example,  adjusting  volume  with vol 0.25 requires two additional
307       bits in which to losslessly  store  its  results  (since  0.25  decimal
308       equals  0.01 binary).  So if the input file bit-depth is 16, then SoX's
309       internal representation will utilise 18 bits after processing this vol‐
310       ume  change.   In  order  to  store the output at the same depth as the
311       input, dithering is used to remove the additional bits.
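
       The bit count in this example can be checked with a little
       arithmetic: a gain of 0.25 is 2 to the power -2, so two extra bits are
       needed.  The sketch below assumes the gain is an exact negative power
       of two; extra_bits is a hypothetical helper, not a SoX command.

```shell
# extra_bits GAIN
# Prints log2(1/GAIN), rounded to the nearest integer: the number of
# additional bits needed to store the scaled samples exactly when GAIN
# is a negative power of two (0.5, 0.25, 0.125, ...).
extra_bits() {
    awk -v g="$1" 'BEGIN { printf "%d\n", log(1 / g) / log(2) + 0.5 }'
}

input_bits=16
echo $(( input_bits + $(extra_bits 0.25) ))    # prints 18
```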
312
313       Use the -V option to see what processing SoX has  automatically  added.
314       The  -D option may be given to override automatic dithering.  To invoke
315       dithering manually (e.g. to select  a  noise-shaping  curve),  see  the
316       dither effect.
317
   Clipping
319       Clipping is distortion that occurs when an audio signal level (or `vol‐
320       ume') exceeds the range of the chosen representation.  In  most  cases,
321       clipping  is  undesirable  and  so should be corrected by adjusting the
322       level prior to the point (in the processing chain) at which it occurs.
323
324       In SoX, clipping could occur, as you might expect, when using  the  vol
325       or gain effects to increase the audio volume. Clipping could also occur
326       with many other effects, when converting one  format  to  another,  and
327       even when simply playing the audio.
328
329       Playing an audio file often involves resampling, and processing by ana‐
330       logue components can introduce a small DC offset and/or  amplification,
331       all  of which can produce distortion if the audio signal level was ini‐
332       tially too close to the clipping point.
333
334       For these reasons, it is usual to make sure that an audio file's signal
335       level  has  some `headroom', i.e. it does not exceed a particular level
336       below the maximum possible level for the  given  representation.   Some
337       standards  bodies recommend as much as 9dB headroom, but in most cases,
338       3dB (≈ 70% linear) is enough.  Note that this wisdom seems to have been
339       lost in modern music production; in fact, many CDs, MP3s, etc.  are now
340       mastered at levels above 0dBFS i.e. the audio is clipped as delivered.
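
       The relationship between headroom in dB and linear amplitude is
       10 to the power (dB/20).  The following shell sketch shows the
       conversion; db_to_linear is a hypothetical helper that uses awk for
       the floating-point arithmetic.

```shell
# db_to_linear DB
# Prints the linear amplitude ratio corresponding to a level of DB
# decibels relative to full scale, to three decimal places.
db_to_linear() {
    awk -v db="$1" 'BEGIN { printf "%.3f\n", exp(db / 20 * log(10)) }'
}

db_to_linear -3    # prints 0.708, i.e. roughly 70% of full scale
db_to_linear -9    # prints 0.355
```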
341
342       SoX's stat and stats effects can assist in determining the signal level
343       in  an  audio file. The gain or vol effect can be used to prevent clip‐
344       ping, e.g.
345          sox dull.wav bright.wav gain -6 treble +6
346       guarantees that the treble boost will not clip.
347
348       If clipping occurs at any point during processing, SoX will  display  a
349       warning message to that effect.
350
351       See also -G and the gain and norm effects.
352
   Input File Combining
354       SoX's  input  combiner can be configured (see OPTIONS below) to combine
355       multiple files using  any  of  the  following  methods:  `concatenate',
356       `sequence',  `mix',  `mix-power',  `merge', or `multiply'.  The default
357       method is `sequence' for play, and `concatenate' for rec and sox.
358
359       For all methods other than `sequence', multiple input files  must  have
360       the  same  sampling rate. If necessary, separate SoX invocations can be
361       used to make sampling rate adjustments prior to combining.
362
363       If the `concatenate' combining method is selected (usually,  this  will
364       be  by  default) then the input files must also have the same number of
365       channels.  The audio from each input will be concatenated in the  order
366       given to form the output file.
367
368       The `sequence' combining method is selected automatically for play.  It
369       is similar to `concatenate' in that the audio from each input  file  is
370       sent  serially to the output file. However, here the output file may be
371       closed and reopened  at  the  corresponding  transition  between  input
372       files.  This may be just what is needed when sending different types of
373       audio to an output device, but is not generally useful when the  output
374       is a normal file.
375
376       If  either  the  `mix' or `mix-power' combining method is selected then
377       two or more input files must be given and will  be  mixed  together  to
378       form  the  output file.  The number of channels in each input file need
379       not be the same, but SoX will issue a warning if they are not and  some
380       channels  in  the  output  file will not contain audio from every input
381       file.  A mixed audio file cannot be un-mixed without reference  to  the
382       original input files.
383
384       If  the  `merge'  combining  method  is selected then two or more input
385       files must be given and will be merged  together  to  form  the  output
386       file.   The number of channels in each input file need not be the same.
387       A merged audio file comprises all of the channels from all of the input
388       files.  Un-merging  is  possible using multiple invocations of SoX with
389       the remix effect.  For example, two mono files could be merged to  form
390       one  stereo file. The first and second mono files would become the left
391       and right channels of the stereo file.
392
393       The `multiply' combining method multiplies the sample values of  corre‐
394       sponding  channels  (treated  as numbers in the interval -1 to +1).  If
395       the number of channels in the input files is not the same, the  missing
396       channels are considered to contain all zero.
397
398       When  combining input files, SoX applies any specified effects (includ‐
399       ing, for example, the vol volume adjustment effect) after the audio has
400       been combined. However, it is often useful to be able to set the volume
401       of (i.e. `balance') the inputs  individually,  before  combining  takes
402       place.
403
404       For  all  combining  methods, input file volume adjustments can be made
405       manually using the -v option (below) which can be given for one or more
406       input  files.  If it is given for only some of the input files then the
407       others receive no volume adjustment.  In some circumstances,  automatic
408       volume adjustments may be applied (see below).
409
410       The -V option (below) can be used to show the input file volume adjust‐
411       ments that have been selected (either manually or automatically).
412
       There are some special considerations that need to be made when mixing
       input files:
415
416       Unlike  the  other  methods, `mix' combining has the potential to cause
417       clipping in the combiner if no balancing is performed.  In  this  case,
418       if manual volume adjustments are not given, SoX will try to ensure that
419       clipping does not occur by automatically adjusting the  volume  (ampli‐
420       tude) of each input signal by a factor of ¹/n, where n is the number of
421       input files.  If this results in audio that is too quiet  or  otherwise
422       unbalanced then the input file volumes can be set manually as described
423       above. Using the norm effect on the mix is another alternative.
424
425       If mixed audio seems loud enough at some points but too quiet in others
426       then  dynamic range compression should be applied to correct this - see
427       the compand effect.
428
429       With the `mix-power' combine method, the mixed volume is  approximately
430       equal to that of one of the input signals.  This is achieved by balanc‐
431       ing using a factor of ¹/√n instead of ¹/n.  Note  that  this  balancing
432       factor  does not guarantee that clipping will not occur, but the number
433       of clips will usually be low and the resultant distortion is  generally
434       imperceptible.
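
       The two automatic balancing factors can be compared numerically.  The
       following shell sketch computes them for a given number of input
       files; mix_factor and mix_power_factor are hypothetical helper names.

```shell
# Per-input amplitude factor applied automatically by `mix': 1/n.
mix_factor() {
    awk -v n="$1" 'BEGIN { printf "%.3f\n", 1 / n }'
}

# Per-input amplitude factor applied automatically by `mix-power': 1/sqrt(n).
mix_power_factor() {
    awk -v n="$1" 'BEGIN { printf "%.3f\n", 1 / sqrt(n) }'
}

mix_factor 2          # prints 0.500
mix_power_factor 2    # prints 0.707
mix_factor 4          # prints 0.250
mix_power_factor 4    # prints 0.500
```

       With four inputs, `mix' scales each one to a quarter of its amplitude,
       whereas `mix-power' only halves it, which is why the latter is louder
       but can occasionally clip.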
435
   Output Files
437       SoX's  default  behaviour  is to take one or more input files and write
438       them to a single output file.
439
440       This behaviour can be changed by specifying the pseudo-effect `newfile'
441       within the effects list.  SoX will then enter multiple output mode.
442
443       In  multiple  output mode, a new file is created when the effects prior
444       to the `newfile' indicate they are  done.   The  effects  chain  listed
445       after  `newfile'  is then started up and its output is saved to the new
446       file.
447
448       In multiple output mode, a unique number will automatically be appended
449       to the end of all filenames.  If the filename has an extension then the
450       number is inserted before the extension.  This behaviour can be custom‐
451       ized  by  placing a %n anywhere in the filename where the number should
452       be substituted.  An optional number can be placed after the % to  indi‐
453       cate a minimum fixed width for the number.
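
       The %n substitution can be pictured with the shell sketch below.  It
       is a model, not SoX's code: expand_name is a hypothetical function, it
       handles only a single-digit width, and it assumes that numbers
       narrower than the minimum width are padded with zeros, which is not
       stated above.

```shell
# expand_name TEMPLATE INDEX
# Replaces a %Wn marker (W = one digit, the minimum width) in TEMPLATE
# with INDEX, zero-padded to W digits (padding style is an assumption).
expand_name() {
    tmpl=$1
    idx=$2
    case "$tmpl" in
        *%[0-9]n*)
            prefix=${tmpl%%\%[0-9]n*}     # text before the marker
            rest=${tmpl#"$prefix"%}       # width digit, 'n', then the suffix
            width=${rest%%n*}             # the width digit
            suffix=${rest#*n}             # text after the marker
            printf "%s%0${width}d%s\n" "$prefix" "$idx" "$suffix"
            ;;
        *)
            echo "no %Wn marker in template" >&2
            return 1
            ;;
    esac
}

expand_name 'ringtone%1n.wav' 1    # prints ringtone1.wav
expand_name 'ringtone%1n.wav' 2    # prints ringtone2.wav
```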
454
455       Multiple output mode is not very useful unless an effect that will stop
456       the effects chain early is specified before the `newfile'.  If  end  of
457       file  is reached before the effects chain stops itself then no new file
458       will be created as it would be empty.
459
460       The following is an example of splitting the first  60  seconds  of  an
461       input file into two 30 second files and ignoring the rest.
462          sox song.wav ringtone%1n.wav trim 0 30 : newfile : trim 0 30
463
   Stopping SoX
465       Usually SoX will complete its processing and exit automatically once it
466       has read all available audio data from the input files.
467
468       If desired, it can be terminated earlier by sending an interrupt signal
469       to the process (usually by pressing the keyboard interrupt key which is
470       normally Ctrl-C).  This is a natural requirement in some circumstances,
471       e.g.  when  using SoX to make a recording.  Note that when using SoX to
472       play multiple files, Ctrl-C behaves slightly differently:  pressing  it
473       once  causes  SoX  to skip to the next file; pressing it twice in quick
474       succession causes SoX to exit.
475
476       Another option to stop processing early is to use an effect that has  a
477       time  period  or sample count to determine the stopping point. The trim
478       effect is an example of this.  Once all  effects  chains  have  stopped
479       then SoX will also stop.
480

FILENAMES

482       Filenames can be simple file names, absolute or relative path names, or
483       URLs (input files only).  Note that URL support requires  that  wget(1)
484       is available.
485
486       Note:  Giving SoX an input or output filename that is the same as a SoX
487       effect-name will not  work  since  SoX  will  treat  it  as  an  effect
488       specification.    The  only  work-around  to  this  is  to  avoid  such
489       filenames. This is generally not difficult since most  audio  filenames
490       have a filename `extension', whilst effect-names do not.
491
   Special Filenames
493       The following special filenames may be used in certain circumstances in
494       place of a normal filename on the command line:
495
       -      SoX can be used in simple pipeline operations by using the
              special filename `-' which, if used as an input filename, will
              cause SoX to read audio data from `standard input' (stdin),
              and which, if used as the output filename, will cause SoX to
              send audio data to `standard output' (stdout).  Note that when
              using this option for the output file, and sometimes when using
              it for an input file, the file-type (see -t below) must also be
              given.
504
505       "|program [options] ..."
              This can be used in place of an input filename to specify that
              the given program's standard output (stdout) be used as an input
508              file.   Unlike - (above), this can be used for several inputs to
509              one SoX command.  For example,  if  `genw'  generates  mono  WAV
510              formatted  signals  to  its  standard output, then the following
511              command makes a stereo file from two generated signals:
512                 sox -M "|genw --imd -" "|genw --thd -" out.wav
513              For  headerless  (raw)  audio,  -t  (and  perhaps  other  format
514              options) will need to be given, preceding the input command.
515
516       "wildcard-filename"
517              Specifies  that  filename `globbing' (wild-card matching) should
518              be performed by SoX instead of by the shell.  This allows a sin‐
519              gle  set of file options to be applied to a group of files.  For
520              example, if the current directory contains  three  `vox'  files,
521              file1.vox, file2.vox, and file3.vox, then
522                 play --rate 6k *.vox
523              will be expanded by the `shell' (in most environments) to
524                 play --rate 6k file1.vox file2.vox file3.vox
525              which will treat only the first vox file as having a sample rate
526              of 6k.  With
527                 play --rate 6k "*.vox"
528              the given sample rate option will be applied to  all  three  vox
529              files.
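
              The difference between the two forms lies in what the shell
              hands to the command.  The sketch below shows this without
              needing SoX or any audio: count_args is a hypothetical stand-in
              for play that only reports how many arguments it received, and
              the three files are empty placeholders in a scratch directory.

```shell
# Create three empty files in a scratch directory to glob against.
workdir=$(mktemp -d) && cd "$workdir" || exit 1
touch file1.vox file2.vox file3.vox

# Stand-in for play: prints the number of arguments it was given.
count_args() { echo "$#"; }

unquoted=$(count_args --rate 6k *.vox)     # the shell expands the pattern
quoted=$(count_args --rate 6k "*.vox")     # the pattern arrives unexpanded
echo "unquoted: $unquoted arguments, quoted: $quoted arguments"
# prints: unquoted: 5 arguments, quoted: 3 arguments

# Remove the scratch directory again.
cd / && rm -rf "$workdir"
```

              In the quoted case the command receives the pattern itself as a
              single filename argument, which is what allows SoX to do the
              matching and apply the format options to every match.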
530
531       -p, --sox-pipe
              This can be used in place of an output filename to specify that
              the SoX command should be used as an input pipe to another SoX
              command.  For example, the command:
535                 play "|sox -n -p synth 2" "|sox -n -p synth 2 tremolo 10" stat
536              plays two `files' in succession, each with different effects.
537
538              -p is in fact an alias for `-t sox -'.
539
540       -d, --default-device
541              This  can  be  used  in  place of an input or output filename to
542              specify that the default audio device (if  one  has  been  built
543              into  SoX)  is to be used.  This is akin to invoking rec or play
544              (as described above).
545
546       -n, --null
547              This can be used in place of an  input  or  output  filename  to
548              specify that a `null file' is to be used.  Note that here, `null
549              file' refers to a SoX-specific mechanism and is not  related  to
550              any operating-system mechanism with a similar name.
551
552              Using a null file to input audio is equivalent to using a normal
553              audio file that contains an infinite amount of silence,  and  as
554              such  is  not  generally  useful unless used with an effect that
555              specifies a finite time length (such as trim or synth).
556
557              Using a null file to output  audio  amounts  to  discarding  the
558              audio and is useful mainly with effects that produce information
559              about the audio instead of affecting it (such  as  noiseprof  or
560              stat).
561
562              The  sampling  rate  associated  with  a null file is by default
563              48 kHz, but, as with a normal file, this can  be  overridden  if
564              desired using command-line format options (see below).
565
   Supported File & Audio Device Types
567       See  soxformat(7) for a list and description of the supported file for‐
568       mats and audio device drivers.
569

OPTIONS

   Global Options
572       These options can be specified on the command line at any point  before
573       the first effect name.
574
575       The  SOX_OPTS  environment  variable can be used to provide alternative
576       default values for SoX's global options.  For example:
577          SOX_OPTS="--buffer 20000 --play-rate-arg -hs --temp /mnt/temp"
578       Note that setting SOX_OPTS can potentially create unwanted  changes  in
579       the  behaviour  of scripts or other programs that invoke SoX.  SOX_OPTS
580       might best be used for things (such  as  in  the  given  example)  that
581       reflect  the  environment  in which SoX is being run.  Enabling options
582       such as --no-clobber as default might be handled better using  a  shell
583       alias since a shell alias will not affect operation in scripts etc.
584
585       One  way  to  ensure that a script cannot be affected by SOX_OPTS is to
586       clear SOX_OPTS at the start of the script, but this of course loses the
587       benefit  of  SOX_OPTS  carrying  some  system-wide default options.  An
588       alternative approach is to explicitly invoke SoX  with  default  option
589       values, e.g.
590          SOX_OPTS="-V --no-clobber"
591          ...
592          sox -V2 --clobber $input $output ...
593       Note  that  the  way to set environment variables varies from system to
594       system. Here are some examples:
595
596       Unix bash:
597          export SOX_OPTS="-V --no-clobber"
598       Unix csh:
599          setenv SOX_OPTS "-V --no-clobber"
600       MS-DOS/MS-Windows:
601          set SOX_OPTS=-V --no-clobber
602       MS-Windows GUI: via Control Panel : System  :  Advanced  :  Environment
603       Variables
604
605       Mac OS X GUI: Refer to Apple's Technical Q&A QA1067 document.
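
       A script that must not be affected by SOX_OPTS could clear the variable
       for its own SoX invocations only.  The following is a sketch under that
       assumption; the wrapper name sox_no_opts is hypothetical:

```shell
# Sketch only: sox_no_opts is a hypothetical wrapper name.
# Run sox in a subshell with SOX_OPTS removed, so that defaults set in the
# caller's environment cannot change what the script does.  The caller's
# own SOX_OPTS is left untouched.
sox_no_opts() {
    (
        unset SOX_OPTS
        sox "$@"
    )
}
```

       The script would then call `sox_no_opts input.wav output.wav' wherever
       it would otherwise call sox directly.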
606
607       --buffer BYTES, --input-buffer BYTES
608              Set  the  size in bytes of the buffers used for processing audio
609              (default 8192).  --buffer applies to input, effects, and  output
610              processing; --input-buffer applies only to input processing (for
611              which it overrides --buffer if both are given).
612
613              Be aware that large values for --buffer will cause SoX to become
614              slow to respond to requests to terminate or to skip the current
615              input file.
616
617       --clobber
618              Don't prompt before overwriting an existing file with  the  same
619              name as that given for the output file.  This is the default be‐
620              haviour.
621
622       --combine concatenate|merge|mix|mix-power|multiply|sequence
623              Select the input file combining method; for some of these, short
624              options are available: -m selects `mix', -M selects `merge', and
625              -T selects `multiply'.
626
627              See Input File Combining above for a description of the  differ‐
628              ent combining methods.
629
630       -D, --no-dither
631              Disable  automatic  dither  - see `Dither' above.  An example of
632              why this might occasionally be useful is if a file has been con‐
633              verted  from  16 to 24 bit with the intention of doing some pro‐
634              cessing on it, but in fact no processing is needed after all and
635              the original 16 bit file has been lost, then, strictly speaking,
636              no dither is needed if converting the file back to 16 bit.   See
637              also  the stats effect for how to determine the actual bit depth
638              of the audio within a file.
639
640       --effects-file FILENAME
641              Use FILENAME to obtain all effects  and  their  arguments.   The
642              file  is  parsed  as if the values were specified on the command
643              line.  A new line can be used in place of the special ":" marker
644              to separate effect chains.  This option causes any effects spec‐
645              ified on the command line to be discarded.
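
              The following sketch writes an effects file with one effects
              chain per line and passes it to SoX.  It assumes that the three
              lines behave like `trim 0 30 : newfile : restart' given on the
              command line; the helper name and the default file name are
              hypothetical:

```shell
# Sketch only: split_with_effects_file and split.effects are hypothetical.
split_with_effects_file() {
    infile=$1
    outfile=$2
    effects_file=${3:-split.effects}

    # One effects chain per line; the line breaks stand in for ":".
    printf '%s\n' 'trim 0 30' 'newfile' 'restart' > "$effects_file"

    sox --effects-file "$effects_file" "$infile" "$outfile"
}
```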
646
647       -G, --guard
648              Automatically invoke the gain effect to guard against  clipping.
649              E.g.
650                 sox -G infile -b 16 outfile rate 44100 dither -s
651              is shorthand for
652                 sox infile -b 16 outfile gain -h rate 44100 gain -rh dither -s
653              See also -V, --norm, and the gain effect.
654
655       -h, --help
656              Show version number and usage information.
657
658       --help-effect NAME
659              Show  usage  information  on the specified effect.  The name all
660              can be used to show usage on all effects.
661
662       --help-format NAME
663              Show information about the specified file format.  The name  all
664              can be used to show information on all formats.
665
666       --i, --info
667              Only if given as the first parameter to sox, behave as soxi(1).
668
669       --interactive
670              Deprecated alias for --no-clobber.
671
672       -m|-M  Equivalent to --combine mix and --combine merge, respectively.
673
674       --magic
675              If  SoX has been built with the optional `libmagic' library then
676              this option can be given to enable its use in helping to  detect
677              audio file types.
678
679       --multi-threaded | --single-threaded
680              By  default,  SoX is `single threaded'.  If the --multi-threaded
681              option is given however then SoX will process audio channels for
682              most multi-channel effects in parallel on hyper-threading/multi-
683              core architectures. This  may  reduce  processing  time,  though
684              sometimes  it  may be necessary to use this option in conjunction
685              with a larger buffer size than is the default to gain any  bene‐
686              fit  from  multi-threaded  processing (e.g. 131072; see --buffer
687              above).
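
              A sketch of combining the two options, using the buffer size
              mentioned above (the helper name resample_mt is hypothetical,
              and whether any speed-up results depends on the system and on
              the effects used):

```shell
# Sketch only: resample_mt is a hypothetical helper name.
# Resample INFILE to RATE, writing OUTFILE, with multi-threaded processing
# and the larger buffer size suggested in the text.
resample_mt() {
    sox --multi-threaded --buffer 131072 "$1" "$2" rate "$3"
}
```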
688
689       --no-clobber
690              Prompt before overwriting an existing file with the same name as
691              that given for the output file.
692
693              N.B.   Unintentionally  overwriting  a  file  is easier than you
694              might think, for example, if you accidentally enter
695                 sox file1 file2 effect1 effect2 ...
696              when what you really meant was
697                 play file1 file2 effect1 effect2 ...
698              then, without this option, file2 will  be  overwritten.   Hence,
699              using  this  option  is recommended. SOX_OPTS (above), a `shell'
700              alias, script, or batch file may be an appropriate way of perma‐
701              nently enabling it.
702
703       --norm Automatically  invoke  the gain effect to guard against clipping
704              and to normalise the audio. E.g.
705                 sox --norm infile -b 16 outfile rate 44100 dither -s
706              is shorthand for
707                 sox infile -b 16 outfile gain -h rate 44100 gain -nh dither -s
708              See also -V, -G, and the gain effect.
709
710       --play-rate-arg ARG
711              Selects a quality option to be used when the  `rate'  effect  is
712              automatically invoked whilst playing audio.  This option is typ‐
713              ically set via the SOX_OPTS environment variable (see above).
714
715       --plot gnuplot|octave|off
716              If not set to off (the default if --plot is not given), run in a
717              mode  that  can be used, in conjunction with the gnuplot program
718              or the GNU Octave program, to assist with the selection and con‐
719              figuration  of many of the transfer-function based effects.  For
720              the first given effect that supports the selected plotting  pro‐
721              gram,  SoX  will  output  commands to plot the effect's transfer
722              function, and then exit without actually processing  any  audio.
723              E.g.
724                 sox --plot octave input-file -n highpass 1320 > highpass.plt
725                 octave highpass.plt
726
727       -q, --no-show-progress
728              Run  in  quiet  mode when SoX wouldn't otherwise do so.  This is
729              the opposite of the -S option.
730
731       -R     Run in `repeatable' mode.  When  this  option  is  given,  where
732              applicable, SoX will embed a fixed time-stamp in the output file
733              (e.g.  AIFF) and will `seed'  pseudo  random  number  generators
734              (e.g.   dither)  with a fixed number, thus ensuring that succes‐
735              sive SoX invocations with the same inputs and the  same  parame‐
736              ters yield the same output.
737
738       --replay-gain track|album|off
739              Select  whether  or not to apply replay-gain adjustment to input
740              files.  The default is off for sox and rec, album for play where
741              (at  least)  the  first two input files are tagged with the same
742              Artist and Album names, and track for play otherwise.
743
744       -S, --show-progress
745              Display input file  format/header  information,  and  processing
746              progress as input file(s) percentage complete, elapsed time, and
747              remaining time (if known; shown in brackets), and the number  of
748              samples  written to the output file.  Also shown is a peak-level
749              meter, and an indication if clipping has  occurred.   The  peak-
750              level meter shows up to two channels and is calibrated for digi‐
751              tal audio as follows (right channel shown):
752
753                            dB FSD   Display   dB FSD   Display
754                             -25     -          -11     ====
755                             -23     =           -9     ====-
756                             -21     =-          -7     =====
757                             -19     ==          -5     =====-
758                             -17     ==-         -3     ======
759                             -15     ===         -1     =====!
760                             -13     ===-
761
762              A three-second peak-held value of headroom in dBs will be  shown
763              to the right of the meter if this is below 6dB.
764
765              This  option  is  enabled  by  default when using SoX to play or
766              record audio.
767
768       -T     Equivalent to --combine multiply.
769
770       --temp DIRECTORY
771              Specify that any temporary files should be created in the  given
772              DIRECTORY.   This can be useful if there are permission or free-
773              space problems with the default location. In  this  case,  using
774              `--temp  .' (to use the current directory) is often a good solu‐
775              tion.
776
777       --version
778              Show SoX's version number and exit.
779
780       -V[level]
781              Set verbosity. This is particularly useful for  seeing  how  any
782              automatic effects have been invoked by SoX.
783
784              SoX  displays  messages on the console (stderr) according to the
785              following verbosity levels:
786
787              0      No messages are shown at all;  use  the  exit  status  to
788                     determine if an error has occurred.
789
790              1      Only  error  messages  are shown.  These are generated if
791                     SoX cannot complete the requested commands.
792
793              2      Warning messages are also shown.  These are generated  if
794                     SoX  can complete the requested commands, but not exactly
795                     according to the  requested  command  parameters,  or  if
796                     clipping occurs.
797
798              3      Descriptions  of  SoX's processing phases are also shown.
799                     Useful for seeing exactly  how  SoX  is  processing  your
800                     audio.
801
802              4 and above
803                     Messages to help with debugging SoX are also shown.
804
805              By  default,  the  verbosity level is set to 2 (shows errors and
806              warnings). Each occurrence of the -V option increases  the  ver‐
807              bosity  level  by  1.  Alternatively, the verbosity level can be
808              set to an absolute number by specifying it immediately after the
809              -V, e.g.  -V0 sets it to 0.
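
              With -V0 the exit status is the only indication of failure, so
              a script might test it explicitly.  A minimal sketch, in which
              the helper name convert_silently is hypothetical:

```shell
# Sketch only: convert_silently is a hypothetical helper name.
# Run a conversion with all SoX messages suppressed and report the result
# based purely on the exit status.
convert_silently() {
    if sox -V0 "$1" "$2"; then
        echo "converted: $2"
    else
        status=$?
        echo "sox failed with exit status $status" >&2
        return "$status"
    fi
}
```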
810
811   Input File Options
812       These  options  apply  only  to  input files and may precede only input
813       filenames on the command line.
814
815       --ignore-length
816              Override an (incorrect) audio length given in  an  audio  file's
817              header. If this option is given then SoX will keep reading audio
818              until it reaches the end of the input file.
819
820       -v, --volume FACTOR
821              Intended for use  when  combining  multiple  input  files,  this
822              option  adjusts  the  volume  of the file that follows it on the
823              command line by a factor of FACTOR. This allows it to  be  `bal‐
824              anced'  w.r.t.  the other input files.  This is a linear (ampli‐
825              tude) adjustment, so a number less than 1 decreases  the  volume
826              and  a number greater than 1 increases it.  If a negative number
827              is given then in addition to the volume  adjustment,  the  audio
828              signal will be inverted.
829
830              See  also  the  norm,  vol, and gain effects, and see Input File
831              Balancing above.
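
              Because FACTOR is a linear amplitude ratio, it can be related
              to decibels with the usual formula dB = 20 * log10(|FACTOR|).
              The following sketch does the conversion with awk; the helper
              name factor_to_db is hypothetical:

```shell
# Sketch only: factor_to_db is a hypothetical helper name.
# Convert a linear amplitude factor, as given to -v, to decibels.
# awk's log() is the natural logarithm, hence the division by log(10).
factor_to_db() {
    awk -v f="$1" 'BEGIN {
        if (f < 0) f = -f    # a negative factor only inverts the signal
        printf "%.2f\n", 20 * log(f) / log(10)
    }'
}
```

              For example, `factor_to_db 0.5' prints -6.02: halving the
              amplitude lowers the level by about 6 dB.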
832
833   Input & Output File Format Options
834       These options apply to the input or output file whose name they immedi‐
835       ately precede on the command line and are used mainly when working with
836       headerless file formats or when specifying a format for the output file
837       that is different to that of the input file.
838
839       -b BITS, --bits BITS
840              The  number  of bits (a.k.a. bit-depth or sometimes word-length)
841              in each encoded sample.  Not  applicable  to  complex  encodings
842              such  as  MP3  or GSM.  Not necessary with encodings that have a
843              fixed number of bits, e.g.  A/μ-law, ADPCM.
844
845              For an input file, the most common use for  this  option  is  to
846              inform SoX of the number of bits per sample in a `raw' (`header‐
847              less') audio file.  For example
848                 sox -r 16k -e signed -b 8 input.raw output.wav
849              converts a particular `raw'  file  to  a  self-describing  `WAV'
850              file.
851
852              For  an output file, this option can be used (perhaps along with
853              -e) to set the output encoding size.  By default (i.e.  if  this
854              option  is  not given), the output encoding size will (providing
855              it is supported by the output file type) be  set  to  the  input
856              encoding size.  For example
857                 sox input.cdda -b 24 output.wav
858              converts  raw  CD  digital  audio  (16-bit, signed-integer) to a
859              24-bit (signed-integer) `WAV' file.
860
861       -1/-2/-3/-4/-8
862              The number of bytes in each encoded sample.  Deprecated  aliases
863              for -b 8, -b 16, -b 24, -b 32, -b 64 respectively.
864
865       -c CHANNELS, --channels CHANNELS
866              The  number of audio channels in the audio file. This can be any
867              number greater than zero.
868
869              For an input file, the most common use for  this  option  is  to
870              inform  SoX  of the number of channels in a `raw' (`headerless')
871              audio file.  Occasionally, it may be useful to use  this  option
872              with  a  `headered'  file,  in order to override the (presumably
873              incorrect) value in the header - note that  this  is  only  sup‐
874              ported with certain file types.  Examples:
875                 sox -r 48k -e float -b 32 -c 2 input.raw output.wav
876              converts  a  particular  `raw'  file  to a self-describing `WAV'
877              file.
878                 play -c 1 music.wav
879              interprets the file  data  as  belonging  to  a  single  channel
880              regardless  of  what is indicated in the file header.  Note that
881              if the file does in fact have two channels, this will result  in
882              the file playing at half speed.
883
884              For  an output file, this option provides a shorthand for speci‐
885              fying that the channels effect should be  invoked  in  order  to
886              change (if necessary) the number of channels in the audio signal
887              to the number given.  For example, the  following  two  commands
888              are equivalent:
889                 sox input.wav -c 1 output.wav bass -3
890                 sox input.wav      output.wav bass -3 channels 1
891              though the second form is more flexible as it allows the effects
892              to be ordered arbitrarily.
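
              When the parameters of a `raw' file have to be guessed, one
              quick check is whether the chosen rate, bit depth and channel
              count give a plausible duration for the file's size.  A sketch
              of that arithmetic (the helper name raw_duration_ms is
              hypothetical):

```shell
# Sketch only: raw_duration_ms is a hypothetical helper name.
# Duration in milliseconds of a headerless file of SIZE bytes, given the
# sample rate in Hz, the bits per sample, and the number of channels.
raw_duration_ms() {
    size=$1
    rate=$2
    bits=$3
    channels=$4
    bytes_per_second=$(( rate * channels * bits / 8 ))
    echo $(( size * 1000 / bytes_per_second ))
}
```

              For the 48 kHz, 32-bit, two-channel example above, each second
              of audio occupies 384000 bytes, so `raw_duration_ms 3840000
              48000 32 2' prints 10000 (ten seconds).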
893
894       -e ENCODING, --encoding ENCODING
895              The audio encoding type.  Sometimes needed with file-types  that
896              support more than one encoding type. For example, with raw, WAV,
897              or AU (but not, for example, with MP3 or FLAC).   The  available
898              encoding types are as follows:
899
900              signed-integer
901                     PCM  data stored as signed (`two's complement') integers.
902                     Commonly used with a 16 or  24  -bit  encoding  size.   A
903                     value of 0 represents minimum signal power.
904
905              unsigned-integer
906                     PCM  data stored as unsigned integers.
907                     Commonly used with an 8-bit encoding size.  A value of  0
908                     represents maximum signal power.
909
910              floating-point
911                     PCM  data stored as IEEE 754 single precision (32-bit) or
912                     double precision (64-bit)  floating-point  (`real')  num‐
913                     bers.  A value of 0 represents minimum signal power.
914
915              a-law  International telephony standard for logarithmic encoding
916                     to 8 bits per sample.  It has a precision  equivalent  to
917                     roughly 13-bit PCM and is sometimes encoded with reversed
918                     bit-ordering (see the -X option).
919
920              u-law, mu-law
921                     North American telephony standard for logarithmic  encod‐
922                     ing to 8 bits per sample.  A.k.a. μ-law.  It has a preci‐
923                     sion equivalent to roughly 14-bit PCM  and  is  sometimes
924                     encoded with reversed bit-ordering (see the -X option).
925
926              oki-adpcm
927                     OKI  (a.k.a. VOX, Dialogic, or Intel) 4-bit ADPCM; it has
928                     a precision equivalent to roughly 12-bit PCM.  ADPCM is a
929                     form  of  audio  compression  that  has a good compromise
930                     between audio quality and encoding/decoding speed.
931
932              ima-adpcm
933                     IMA (a.k.a. DVI) 4-bit ADPCM; it has a precision  equiva‐
934                     lent to roughly 13-bit PCM.
935
936              ms-adpcm
937                     Microsoft  4-bit  ADPCM; it has a precision equivalent to
938                     roughly 14-bit PCM.
939
940              gsm-full-rate
941                     GSM is currently  used  for  the  vast  majority  of  the
942                     world's  digital  wireless  telephone calls.  It utilises
943                     several audio formats with different bit-rates and  asso‐
944                     ciated  speech quality.  SoX has support for GSM's origi‐
945                     nal 13kbps `Full Rate' audio format.  It is usually  CPU-
946                     intensive to work with GSM audio.
947
948              Encoding  names  can  be  abbreviated  where  this  would not be
949              ambiguous; e.g. `unsigned-integer' can be given as `un', but not
950              `u' (ambiguous with `u-law').
951
952              For  an  input  file,  the most common use for this option is to
953              inform SoX of the encoding of a `raw' (`headerless') audio  file
954              (see the examples in -b and -c above).
955
956              For  an output file, this option can be used (perhaps along with
957              -b) to set the output encoding type.  For example
958                 sox input.cdda -e float output1.wav
959
960                 sox input.cdda -b 64 -e float output2.wav
961              convert raw CD digital audio (16-bit, signed-integer) to  float‐
962              ing-point `WAV' files (single & double precision respectively).
963
964              By default (i.e. if this option is not given), the output encod‐
965              ing type will (providing it is  supported  by  the  output  file
966              type) be set to the input encoding type.
967
968       -s/-u/-f/-A/-U/-o/-i/-a/-g
969              Deprecated  aliases  for  specifying  the encoding types signed-
970              integer, unsigned-integer, floating-point, a-law,  mu-law,  oki-
971              adpcm,  ima-adpcm,  ms-adpcm, gsm-full-rate respectively (see -e
972              above).
973
974       --no-glob
975              Specifies that filename `globbing' (wild-card  matching)  should
976              not be performed by SoX on the following filename.  For example,
977              if the current  directory  contains  the  two  files  `five-sec‐
978              onds.wav' and `five*.wav', then
979                 play --no-glob "five*.wav"
980              can be used to play just the single file `five*.wav'.
981
982       -r, --rate RATE[k]
983              Gives the sample rate in Hz (or kHz if appended with `k') of the
984              file.
985
986              For an input file, the most common use for  this  option  is  to
987              inform  SoX  of  the sample rate of a `raw' (`headerless') audio
988              file (see the examples in -b and -c above).  Occasionally it may
989              be useful to use this option with a `headered' file, in order to
990              override the (presumably incorrect) value in the header  -  note
991              that  this is only supported with certain file types.  For exam‐
992              ple, if audio was recorded with a sample-rate of say 48k from  a
993              source that played back a little, say 1.5%, too slowly, then
994                 sox -r 48720 input.wav output.wav
995              effectively  corrects the speed by changing only the file header
996              (but see also the speed effect for the more  usual  solution  to
997              this problem).
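
              The value 48720 in the example above is simply the nominal rate
              scaled by the speed error: 48000 * 1.015 = 48720.  A sketch of
              the same calculation in integer shell arithmetic (the helper
              name corrected_rate is hypothetical):

```shell
# Sketch only: corrected_rate is a hypothetical helper name.
# Scale a nominal sample rate by a speed error given in tenths of a
# percent, so 15 means that the source played back 1.5% too slowly.
corrected_rate() {
    nominal=$1
    slow_tenths_percent=$2
    echo $(( nominal * (1000 + slow_tenths_percent) / 1000 ))
}
```

              Here `corrected_rate 48000 15' prints 48720, which could then
              be passed to -r.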
998
999              For  an output file, this option provides a shorthand for speci‐
1000              fying that the rate effect should be invoked in order to  change
1001              (if  necessary) the sample rate of the audio signal to the given
1002              value.  For example, the following two commands are equivalent:
1003                 sox input.wav -r 48k output.wav bass -3
1004                 sox input.wav        output.wav bass -3 rate 48k
1005              though the second form  is  more  flexible  as  it  allows  rate
1006              options  to be given, and allows the effects to be ordered arbi‐
1007              trarily.
1008
1009       -t, --type FILE-TYPE
1010              Gives the type of the audio file.  For  both  input  and  output
1011              files,  this option is commonly used to inform SoX of the type of a
1012              `headerless' audio file (e.g. raw, mp3) where the actual/desired
1013              type  cannot be determined from a given filename extension.  For
1014              example:
1015                 another-command | sox -t mp3 - output.wav
1016
1017                 sox input.wav -t raw output.bin
1018              It can also be used to override the type  implied  by  an  input
1019              filename  extension,  but  if  overriding with a type that has a
1020              header, SoX will exit with an appropriate error message if  such
1021              a header is not actually present.
1022
1023              See soxformat(7) for a list of supported file types.
1024
1025       -L, --endian little
1026       -B, --endian big
1027       -x, --endian swap
1028              These  options  specify whether the byte-order of the audio data
1029              is, respectively, `little endian', `big endian', or the opposite
1030              to  that  of  the system on which SoX is being used.  Endianness
1031              applies only to data encoded as floating-point, or as signed  or
1032              unsigned  integers of 16 or more bits.  It is often necessary to
1033              specify one of these options for headerless files, and sometimes
1034              necessary   for  (otherwise)  self-describing  files.   A  given
1035              endian-setting option may be ignored for  an  input  file  whose
1036              header contains a specific endianness identifier, or for an out‐
1037              put file that is actually an audio device.
1038
1039              N.B.  Unlike other format characteristics, the endianness (byte,
1040              nibble,  &  bit ordering) of the input file is not automatically
1041              used for the output file; so, for example, when the following is
1042              run on a little-endian system:
1043                 sox -B audio.s16 trimmed.s16 trim 2
1044              trimmed.s16 will be created as little-endian;
1045                 sox -B audio.s16 -B trimmed.s16 trim 2
1046              must be used to preserve big-endianness in the output file.
1047
1048              The -V option can be used to check the selected orderings.
1049
1050       -N, --reverse-nibbles
1051              Specifies that the nibble ordering (i.e. the 2 halves of a byte)
1052              of the samples should be reversed; sometimes useful with  ADPCM-
1053              based formats.
1054
1055              N.B.  See also N.B. in section on -x above.
1056
1057       -X, --reverse-bits
1058              Specifies  that  the  bit  ordering  of  the  samples  should be
1059              reversed; sometimes useful with a few (mostly  headerless)  for‐
1060              mats.
1061
1062              N.B.  See also N.B. in section on -x above.
1063
1064   Output File Format Options
1065       These  options  apply  only to the output file and may precede only the
1066       output filename on the command line.
1067
1068       --add-comment TEXT
1069              Append a comment in the output file header (where applicable).
1070
1071       --comment TEXT
1072              Specify the comment text to store  in  the  output  file  header
1073              (where applicable).
1074
1075              SoX  will  provide  a  default comment if this option (or --com‐
1076              ment-file) is not given. To specify that no  comment  should  be
1077              stored in the output file, use --comment "" .
1078
1079       --comment-file FILENAME
1080              Specify  a file containing the comment text to store in the out‐
1081              put file header (where applicable).
1082
1083       -C, --compression FACTOR
1084              The compression factor for variably compressing output file for‐
1085              mats.   If  this  option is not given then a default compression
1086              factor will apply.  The compression factor is  interpreted  dif‐
1087              ferently  for  different  compressing  file  formats.   See  the
1088              description of the file formats that use this option in  soxfor‐
1089              mat(7) for more information.
1090

EFFECTS

1092       In  addition  to converting, playing and recording audio files, SoX can
1093       be used to invoke a number of audio `effects'.  Multiple effects may be
1094       applied by specifying them one after another at the end of the SoX com‐
1095       mand line, forming an `effects chain'.   Note  that  applying  multiple
1096       effects  in  real-time (i.e. when playing audio) is likely to require a
1097       high performance computer. Stopping other  applications  may  alleviate
1098       performance issues should they occur.
1099
1100       Some  of the SoX effects are primarily intended to be applied to a sin‐
1101       gle instrument or `voice'.  To facilitate this, the  remix  effect  and
1102       the  global  SoX option -M can be used to isolate then recombine tracks
1103       from a multi-track recording.
1104
1105   Multiple Effect Chains
1106       A single effects chain is made up of one or more effects.   Audio  from
1107       the input runs through the chain until either the end of the input file
1108       is reached or an effect in the chain requests to terminate the chain.
1109
1110       SoX supports running multiple effects chains over the input audio.   In
1111       this  case,  when  one chain indicates it is done processing audio, the
1112       audio data is then sent through the next effects chain.  This continues
1113       until  either no more effects chains exist or the input has reached the
1114       end of the file.
1115
1116       An effects chain is terminated by placing a : (colon) after an  effect.
1117       Any following effects are a part of a new effects chain.
1118
1119       It  is  important  to  place the effect that will stop the chain as the
1120       first effect in the chain.   This  is  because  any  samples  that  are
1121       buffered  by effects to the left of the terminating effect will be
1122       discarded.  The number of samples discarded is related to the  --buffer
1123       option and it should be kept small, relative to the sample rate, if the
1124       terminating effect cannot be first.  Further  information  on  stopping
1125       effects can be found in the Stopping SoX section.
1126
1127       There  are a few pseudo-effects that aid using multiple effects chains.
1128       These include newfile which will start writing to  a  new  output  file
1129       before  moving  to  the  next effects chain and restart which will move
1130       back to the first effects chain.  Pseudo-effects must be  specified  as
1131       the  first  effect  in  a chain and as the only effect in a chain (they
1132       must have a : before and after they are specified).
1133
1134       The following is an example of multiple effects chains.  It will  split
1135       the  input file into multiple files of 30 seconds in length.  Each
1136       output filename will have a unique number in its name as documented in the
1137       Output Files section.
1138          sox infile.wav output.wav trim 0 30 : newfile : restart
1139
1140   Common Notation And Parameters
1141       In the descriptions that follow, brackets [ ] are used to denote param‐
1142       eters that are optional, braces { }  to  denote  those  that  are  both
1143       optional  and  repeatable,  and angle brackets < > to denote those that
1144       are repeatable but not optional.  Where applicable, default values  for
1145       optional parameters are shown in parentheses ( ).
1146
1147       The  following parameters are used with, and have the same meaning for,
1148       several effects:
1149
1150       centre[k]
1151              See frequency.
1152
1153       frequency[k]
1154              A frequency in Hz, or, if appended with `k', kHz.
1155
1156       gain   A power gain in dB.  Zero gives no gain; less than zero gives an
1157              attenuation.
1158
1159       width[h|k|o|q]
1160              Used to specify the band-width of a filter.  A number of differ‐
1161              ent methods to specify the width are available (though  not  all
1162              for  every effect).  One of the characters shown may be appended
1163              to select the desired method as follows:
1164
1165                                        Method    Notes
1166                                   h      Hz
1167                                   k     kHz
1168                                   o   Octaves
1169                                   q   Q-factor   See [2]
1170
              For each effect that uses this parameter, the default method
              (i.e. if no character is appended) is the one that is listed
              first in the first line of the effect's description.
1174
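       The units above can be related numerically.  The following Python
       sketch (not part of SoX; all function names are hypothetical) converts
       a gain in dB to power and amplitude ratios, and a band-width in Hz or
       octaves to a Q-factor.  The octave relation is the textbook analogue
       formula and is an assumption here; SoX may compute it differently
       internally.

```python
import math

def db_to_power_ratio(db):
    """Power ratio for a gain in dB."""
    return 10.0 ** (db / 10.0)

def db_to_amplitude_ratio(db):
    """Amplitude (sample value) ratio for the same gain in dB."""
    return 10.0 ** (db / 20.0)

def q_from_hz(centre_hz, width_hz):
    """Q-factor of a band of the given width in Hz around centre_hz."""
    return centre_hz / width_hz

def q_from_octaves(octaves):
    """Analogue-prototype relation between band-width in octaves and Q."""
    return math.sqrt(2.0 ** octaves) / (2.0 ** octaves - 1.0)

print(db_to_amplitude_ratio(-6.0))   # roughly half amplitude
print(q_from_hz(1000.0, 200.0))
print(q_from_octaves(1.0))
```
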
1175       To see if SoX has support for an optional effect, enter sox -h and look
1176       for its name under the list: `EFFECTS'.
1177
1178   Supported Effects
1179       Note:  a categorised list of the effects can be found in the accompany‐
1180       ing `README' file.
1181
1182       allpass frequency[k] width[h|k|o|q]
1183              Apply a two-pole all-pass filter with central frequency (in  Hz)
1184              frequency,  and  filter-width width.  An all-pass filter changes
1185              the audio's frequency to phase relationship without changing its
1186              frequency to amplitude relationship.  The filter is described in
1187              detail in [1].
1188
1189              This effect supports the --plot global option.
1190
1191       band [-n] center[k] [width[h|k|o|q]]
1192              Apply a band-pass filter.  The frequency  response  drops  loga‐
1193              rithmically  around  the  center frequency.  The width parameter
1194              gives the slope of the drop.  The frequencies at center +  width
1195              and  center  -  width will be half of their original amplitudes.
1196              band defaults to a mode oriented to pitched audio,  i.e.  voice,
1197              singing,  or instrumental music.  The -n (for noise) option uses
1198              the alternate  mode  for  un-pitched  audio  (e.g.  percussion).
1199              Warning: -n introduces a power-gain of about 11dB in the filter,
1200              so beware of output clipping.   band  introduces  noise  in  the
1201              shape  of  the  filter, i.e. peaking at the center frequency and
1202              settling around it.
1203
1204              This effect supports the --plot global option.
1205
1206              See also sinc for a bandpass filter with steeper shoulders.
1207
1208       bandpass|bandreject [-c] frequency[k] width[h|k|o|q]
1209              Apply a two-pole Butterworth  band-pass  or  band-reject  filter
1210              with  central  frequency  frequency,  and (3dB-point) band-width
1211              width.  The -c option applies only to  bandpass  and  selects  a
1212              constant skirt gain (peak gain = Q) instead of the default: con‐
1213              stant 0dB peak gain.  The filters roll off  at  6dB  per  octave
1214              (20dB per decade) and are described in detail in [1].
1215
1216              These effects support the --plot global option.
1217
1218              See also sinc for a bandpass filter with steeper shoulders.
1219
1220       bandreject frequency[k] width[h|k|o|q]
1221              Apply a band-reject filter.  See the description of the bandpass
1222              effect for details.
1223
1224       bass|treble gain [frequency[k] [width[s|h|k|o|q]]]
1225              Boost or cut the bass (lower) or treble (upper)  frequencies  of
1226              the audio using a two-pole shelving filter with a response simi‐
1227              lar to that of a standard hi-fi's tone-controls.  This  is  also
1228              known as shelving equalisation (EQ).
1229
1230              gain  gives  the  gain  at  0 Hz (for bass), or whichever is the
1231              lower of ∼22 kHz and the Nyquist frequency  (for  treble).   Its
1232              useful  range is about -20 (for a large cut) to +20 (for a large
1233              boost).  Beware of Clipping when using a positive gain.
1234
1235              If desired, the filter can be  fine-tuned  using  the  following
1236              optional parameters:
1237
1238              frequency sets the filter's central frequency and so can be used
1239              to extend or reduce the frequency range to be  boosted  or  cut.
1240              The default value is 100 Hz (for bass) or 3 kHz (for treble).
1241
              width determines how steep the filter's shelf transition is.  In
              addition to the common width specification methods described
1244              above,  `slope'  (the  default,  or if appended with `s') may be
1245              used.  The useful range of `slope' is about 0.3,  for  a  gentle
1246              slope,  to 1 (the maximum), for a steep slope; the default value
1247              is 0.5.
1248
1249              The filters are described in detail in [1].
1250
1251              These effects support the --plot global option.
1252
1253              See also equalizer for a peaking equalisation effect.
1254
1255       bend [-f frame-rate(25)] [-o over-sample(16)] { delay,cents,duration }
1256              Changes pitch by specified amounts  at  specified  times.   Each
1257              given triple: delay,cents,duration specifies one bend.  delay is
1258              the amount of time after the start of the audio stream,  or  the
1259              end  of  the previous bend, at which to start bending the pitch;
1260              cents is the number of cents (100 cents = 1 semitone)  by  which
1261              to  bend  the  pitch, and duration the length of time over which
1262              the pitch will be bent.
1263
1264              The pitch-bending algorithm utilises the Discrete Fourier Trans‐
1265              form  (DFT)  at  a particular frame rate and over-sampling rate.
1266              The -f and -o parameters may be used to adjust these  parameters
1267              and thus control the smoothness of the changes in pitch.
1268
1269              For  example,  an  initial  tone  is  generated, then bent three
1270              times, yielding four different notes in total:
1271                 play -n synth 2.5 sin 667 gain 1 \
1272                   bend .35,180,.25  .15,740,.53  0,-520,.3
1273              Note that the clipping that  is  produced  in  this  example  is
1274              deliberate; to remove it, use gain -5 in place of gain 1.
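
              The relation between cents and frequency can be sketched in
              Python as follows (not part of SoX; the names are hypothetical).
              The sketch assumes that each bend is applied relative to the
              pitch reached by the previous bend.

```python
def cents_to_ratio(cents):
    """Frequency ratio for a pitch change in cents (1200 cents = 1 octave)."""
    return 2.0 ** (cents / 1200.0)

def bend_sequence(start_hz, bends_cents):
    """Frequencies reached after each bend, assuming bends accumulate."""
    freqs = [start_hz]
    for cents in bends_cents:
        freqs.append(freqs[-1] * cents_to_ratio(cents))
    return freqs

# The four notes of the example above, under the cumulative assumption.
for hz in bend_sequence(667.0, [180, 740, -520]):
    print(round(hz, 1))
```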
1275
1276       biquad b0 b1 b2 a0 a1 a2
              Apply a biquad IIR filter with the given coefficients, where b*
              and a* are the numerator and denominator coefficients
              respectively.
1280
1281              See http://en.wikipedia.org/wiki/Digital_biquad_filter (where a0
1282              = 1).
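
              A biquad corresponds to a second-order difference equation.  The
              following Python sketch (not SoX's own code; the function name
              is hypothetical) applies it in direct form I, dividing by a0 so
              that un-normalised coefficients can be used.

```python
def biquad(samples, b0, b1, b2, a0, a1, a2):
    """a0*y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2] - a1*y[n-1] - a2*y[n-2]"""
    x1 = x2 = y1 = y2 = 0.0   # previous inputs and outputs, initially silent
    out = []
    for x0 in samples:
        y0 = (b0 * x0 + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2) / a0
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out

# Impulse response of a simple one-pole low-pass expressed as a biquad.
print(biquad([1.0, 0.0, 0.0, 0.0], 1.0, 0.0, 0.0, 1.0, -0.5, 0.0))
```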
1283
1284       channels CHANNELS
1285              Invoke a simple algorithm to change the number  of  channels  in
1286              the  audio  signal  to  the  given  number  CHANNELS:  mixing if
1287              decreasing the number of channels or duplicating  if  increasing
1288              the number of channels.
1289
1290              The  channels effect is invoked automatically if SoX's -c option
1291              specifies a number of channels that is different to that of  the
1292              input  file(s).   Alternatively, if this effect is given explic‐
1293              itly, then SoX's -c option need not be given.  For example,  the
1294              following two commands are equivalent:
1295                 sox input.wav -c 1 output.wav bass -3
1296                 sox input.wav      output.wav bass -3 channels 1
1297              though the second form is more flexible as it allows the effects
1298              to be ordered arbitrarily.
1299
1300              See also  remix  for  an  effect  that  allows  channels  to  be
1301              mixed/selected arbitrarily.
1302
1303       chorus gain-in gain-out <delay decay speed depth -s|-t>
1304              Add  a chorus effect to the audio.  This can make a single vocal
1305              sound like a chorus, but can also be applied to instrumentation.
1306
1307              Chorus resembles an echo effect with a short delay, but  whereas
1308              with echo the delay is constant, with chorus, it is varied using
              sinusoidal or triangular modulation.  The modulation depth
              defines how far the modulated delay ranges before or after the
              nominal delay.  Hence the delayed sound will sound slower or
              faster, that is, the delayed sound is tuned around the original
              one, like in a chorus where some vocals are slightly off key.
              See [3] for more discussion of the chorus effect.
1315
1316              Each  four-tuple  parameter  delay/decay/speed/depth  gives  the
1317              delay in milliseconds and the decay (relative to gain-in) with a
1318              modulation speed in Hz using depth in milliseconds.  The modula‐
1319              tion is either sinusoidal (-s) or triangular (-t).  Gain-out  is
1320              the volume of the output.
1321
1322              A  typical delay is around 40ms to 60ms; the modulation speed is
1323              best near 0.25Hz and the modulation depth around 2ms.  For exam‐
1324              ple, a single delay:
1325                 play guitar1.wav chorus 0.7 0.9 55 0.4 0.25 2 -t
1326              Two delays of the original samples:
1327                 play guitar1.wav chorus 0.6 0.9 50 0.4 0.25 2 -t \
1328                    60 0.32 0.4 1.3 -s
1329              A fuller sounding chorus (with three additional delays):
1330                 play guitar1.wav chorus 0.5 0.9 50 0.4 0.25 2 -t \
1331                    60 0.32 0.4 2.3 -t 40 0.3 0.3 1.3 -s
1332
1333       compand attack1,decay1{,attack2,decay2}
1334              [soft-knee-dB:]in-dB1[,out-dB1]{,in-dB2,out-dB2}
1335              [gain [initial-volume-dB [delay]]]
1336
1337              Compand (compress or expand) the dynamic range of the audio.
1338
1339              The  attack and decay parameters (in seconds) determine the time
1340              over which the instantaneous level of the input signal is  aver‐
1341              aged to determine its volume; attacks refer to increases in vol‐
1342              ume and decays refer to decreases.   For  most  situations,  the
1343              attack  time  (response  to  the music getting louder) should be
1344              shorter than the decay time because the human ear is more sensi‐
1345              tive  to  sudden  loud music than sudden soft music.  Where more
1346              than one pair of attack/decay  parameters  are  specified,  each
1347              input  channel  is  companded separately and the number of pairs
1348              must agree with the number of input  channels.   Typical  values
1349              are 0.3,0.8 seconds.
1350
1351              The  second  parameter  is  a  list of points on the compander's
1352              transfer function specified in dB relative to the maximum possi‐
1353              ble  signal  amplitude.   The input values must be in a strictly
1354              increasing order but the transfer function does not have  to  be
1355              monotonically rising.  If omitted, the value of out-dB1 defaults
1356              to the same value as in-dB1; levels below in-dB1  are  not  com‐
1357              panded  (but  may  have gain applied to them).  The point 0,0 is
1358              assumed but may be overridden (by 0,out-dBn).  If  the  list  is
              preceded by a soft-knee-dB value, then the points where adjacent
              line segments on the transfer function meet will be rounded
1361              by  the  amount given.  Typical values for the transfer function
1362              are 6:-70,-60,-20.
1363
1364              The third (optional) parameter is an additional gain in dB to be
1365              applied  at  all points on the transfer function and allows easy
1366              adjustment of the overall gain.
1367
1368              The fourth (optional)  parameter  is  an  initial  level  to  be
1369              assumed  for  each channel when companding starts.  This permits
1370              the user to supply a nominal level initially, so that, for exam‐
1371              ple,  a  very large gain is not applied to initial signal levels
1372              before the companding action has begun to operate: it  is  quite
1373              probable  that  in  such  an event, the output would be severely
1374              clipped while the compander gain  properly  adjusts  itself.   A
1375              typical value (for audio which is initially quiet) is -90 dB.
1376
1377              The fifth (optional) parameter is a delay in seconds.  The input
1378              signal is analysed immediately to control the compander, but  it
1379              is  delayed before being fed to the volume adjuster.  Specifying
1380              a delay approximately equal to the attack/decay times allows the
1381              compander to effectively operate in a `predictive' rather than a
1382              reactive mode.  A typical value is 0.2 seconds.
1383
1384                                    *        *        *
1385
1386              The following example might be used to make  a  piece  of  music
1387              with both quiet and loud passages suitable for listening to in a
1388              noisy environment such as a moving vehicle:
1389                 sox asz.wav asz-car.wav compand 0.3,1 6:-70,-60,-20 -5 -90 0.2
1390              The transfer function (`6:-70,...') says that very  soft  sounds
1391              (below -70dB) will remain unchanged.  This will stop the compan‐
1392              der from boosting  the  volume  on  `silent'  passages  such  as
1393              between  movements.   However,  sounds in the range -60dB to 0dB
1394              (maximum volume) will be boosted so that the 60dB dynamic  range
1395              of  the  original  music  will  be compressed 3-to-1 into a 20dB
1396              range, which is wide enough to enjoy the music but narrow enough
1397              to  get  around  the road noise.  The `6:' selects 6dB soft-knee
1398              companding.  The -5 (dB) output gain is needed to avoid clipping
1399              (the  number  is  inexact,  and was derived by experimentation).
1400              The -90 (dB) for the initial volume will work fine  for  a  clip
1401              that  starts  with  near silence, and the delay of 0.2 (seconds)
1402              has the effect of causing the compander  to  react  a  bit  more
1403              quickly to sudden volume changes.
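
              The transfer function of this example can be evaluated with a
              small Python sketch (not SoX's own code; the names are
              hypothetical).  It interpolates linearly in dB between the
              points -70,-70, then -60,-20, then the assumed 0,0, and it
              ignores the 6dB soft-knee rounding.

```python
# Points of `-70,-60,-20': out-dB1 is omitted, so it equals in-dB1.
POINTS = [(-70.0, -70.0), (-60.0, -20.0), (0.0, 0.0)]

def transfer_db(in_db, points=POINTS, gain_db=0.0):
    """Output level in dB for an input level in dB (soft knee ignored)."""
    if in_db <= points[0][0]:
        # Below in-dB1 the level is not companded; only the gain applies.
        return in_db + gain_db
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if in_db <= x1:
            frac = (in_db - x0) / (x1 - x0)
            return y0 + frac * (y1 - y0) + gain_db
    return points[-1][1] + gain_db

for level in (-80.0, -65.0, -60.0, -30.0, 0.0):
    print(level, transfer_db(level, gain_db=-5.0))
```

              With these points, an input at -30dB maps to -10dB before the
              -5dB gain is applied, which illustrates the 3-to-1 compression
              of the upper 60dB of the range.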
1404
1405              In  the  next example, compand is being used as a noise-gate for
1406              when the noise is at a lower level than the signal:
1407                 play infile compand .1,.2 -inf,-50.1,-inf,-50,-50 0 -90 .1
1408              Here is another noise-gate, this time for when the noise is at a
1409              higher  level  than the signal (making it, in some ways, similar
1410              to squelch):
1411                 play infile compand .1,.1 -45.1,-45,-inf,0,-inf 45 -90 .1
1412              This effect supports the --plot global option (for the  transfer
1413              function).
1414
1415              See also mcompand for a multiple-band companding effect.
1416
1417       contrast [enhancement-amount(75)]
1418              Comparable  with compression, this effect modifies an audio sig‐
1419              nal to make it sound louder.   enhancement-amount  controls  the
1420              amount  of  the  enhancement and is a number in the range 0-100.
1421              Note that enhancement-amount = 0 still gives a significant  con‐
1422              trast enhancement.
1423
1424              See also the compand and mcompand effects.
1425
1426       dcshift shift [limitergain]
1427              Apply  a  DC shift to the audio.  This can be useful to remove a
1428              DC offset (caused perhaps by a hardware problem in the recording
1429              chain)  from  the  audio.   The effect of a DC offset is reduced
1430              headroom and hence volume.  The stat or stats effect can be used
1431              to determine if a signal has a DC offset.
1432
1433              The  given dcshift value is a floating point number in the range
1434              of ±2 that indicates the amount to shift the audio (which is  in
1435              the range of ±1).
1436
1437              An  optional  limitergain  can  be specified as well.  It should
1438              have a value much less than 1 (e.g. 0.05 or 0.02)  and  is  used
1439              only on peaks to prevent clipping.
1440
1441                                    *        *        *
1442
1443              An  alternative  approach to removing a DC offset (albeit with a
1444              short delay) is to use the highpass filter effect at a frequency
1445              of say 10Hz, as illustrated in the following example:
1446                 sox -n dc.wav synth 5 sin %0 50
1447                 sox dc.wav fixed.wav highpass 10
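
              The idea of measuring and removing a DC offset can be sketched
              in Python (not SoX's own code; the names are hypothetical).
              Samples are assumed to lie in the range ±1, and a plain hard
              limit stands in for the limitergain behaviour, which is not
              modelled here.

```python
def dc_offset(samples):
    """Mean sample value; non-zero means the signal has a DC offset."""
    return sum(samples) / len(samples)

def apply_dcshift(samples, shift):
    """Add shift to every sample, hard-limiting the result to ±1."""
    return [max(-1.0, min(1.0, s + shift)) for s in samples]

audio = [0.3, 0.1, 0.5, 0.3]          # hypothetical samples with an offset
offset = dc_offset(audio)
fixed = apply_dcshift(audio, -offset)
print(offset, dc_offset(fixed))
```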
1448
1449       deemph Apply Compact Disc (IEC 60908) de-emphasis (a treble attenuation
1450              shelving filter).
1451
1452              Pre-emphasis was applied in the mastering of some CDs issued  in
1453              the early 1980s.  These included many classical music albums, as
1454              well as now sought-after issues of albums by The  Beatles,  Pink
1455              Floyd  and  others.   Pre-emphasis should be removed at playback
1456              time by a de-emphasis filter in the playback  device.   However,
1457              not  all  modern CD players have this filter, and very few PC CD
1458              drives have it; playing pre-emphasised audio without the correct
1459              de-emphasis filter results in audio that sounds harsh and is far
1460              from what its creators intended.
1461
1462              With the deemph effect, it is possible to  apply  the  necessary
1463              de-emphasis  to  audio that has been extracted from a pre-empha‐
1464              sised CD, and then either burn the de-emphasised audio to a  new
1465              CD  (which will then play correctly on any CD player), or simply
1466              play the correctly de-emphasised audio files  on  the  PC.   For
1467              example:
1468                 sox track1.wav track1-deemph.wav deemph
1469              and then burn track1-deemph.wav to CD, or
1470                 play track1-deemph.wav
1471              or simply
1472                 play track1.wav deemph
1473              The  de-emphasis  filter is implemented as a biquad; its maximum
1474              deviation from the ideal response is only 0.06dB (up to 20kHz).
1475
1476              This effect supports the --plot global option.
1477
1478              See also the bass and treble shelving equalisation effects.
1479
1480       delay {length}
1481              Delay one or more audio channels.  length can specify a time or,
1482              if  appended  with  an `s', a number of samples.  Do not specify
1483              both time and samples delays in the same command.  For  example,
1484              delay  1.5  0  0.5  delays the first channel by 1.5 seconds, the
1485              third channel by 0.5 seconds, and leaves the second channel (and
1486              any other channels that may be present) un-delayed.  The follow‐
1487              ing (one long) command plays a chime sound:
1488                 play -n synth -j 3 sin %3 sin %-2 sin %-5 sin %-9 \
1489                   sin %-14 sin %-21 fade h .01 2 1.5 delay \
1490                   1.3 1 .76 .54 .27 remix - fade h 0 2.7 2.5 norm -1
1491              and this plays a guitar chord:
1492                 play -n synth pl G2 pl B2 pl D3 pl G3 pl D4 pl G4 \
1493                   delay 0 .05 .1 .15 .2 .25 remix - fade 0 4 .1 norm -1
1494
1495       dither [-a] [-S|-s|-f filter]
1496              Apply dithering to the audio.   Dithering  deliberately  adds  a
1497              small  amount  of  noise  to the signal in order to mask audible
1498              quantization effects that can occur if the output sample size is
1499              less than 24 bits.  With no options, this effect will add trian‐
1500              gular (TPDF) white noise.  Noise-shaping (only for certain  sam‐
1501              ple  rates)  can be selected with -s.  With the -f option, it is
1502              possible to select a particular noise-shaping  filter  from  the
1503              following   list:   lipshitz,  f-weighted,  modified-e-weighted,
1504              improved-e-weighted, gesemann, shibata,  low-shibata,  high-shi‐
1505              bata.   Note  that  most  filter  types  are available only with
1506              44100Hz sample rate.  The filter types are distinguished by  the
1507              following  properties: audibility of noise, level of (inaudible,
1508              but in some circumstances, otherwise  problematic)  shaped  high
1509              frequency noise, and processing speed.
1510              See  http://sox.sourceforge.net/SoX/NoiseShaping  for  graphs of
1511              the different noise-shaping curves.
1512
1513              The -S option selects a slightly `sloped' TPDF,  biased  towards
1514              higher  frequencies.   It  can  be used at any sampling rate but
1515              below ≈22k, plain TPDF is probably  better,  and  above  ≈  37k,
1516              noise-shaped is probably better.
1517
1518              The  -a option enables a mode where dithering (and noise-shaping
1519              if applicable) are automatically enabled only when needed.   The
1520              most  likely  use for this is when applying fade in or out to an
1521              already dithered file, so that the redithering applies  only  to
1522              the  faded portions.  However, auto dithering is not fool-proof,
1523              so the fades should be carefully checked for any  noise  modula‐
1524              tion;  if  this occurs, then either re-dither the whole file, or
              use trim, fade, and concatenate.
1526
              If the SoX global option -R is not given, then the
1528              pseudo-random  number generator used to generate the white noise
1529              will be `reseeded', i.e. the generated noise will  be  different
1530              between invocations.
1531
1532              This  effect  should  not  be  followed by any other effect that
1533              affects the audio.
1534
1535              See also the `Dither' section above.
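
              Plain TPDF dithering can be sketched in Python as below (not
              SoX's own code; the names are hypothetical).  The sketch assumes
              noise spanning ±1 least significant bit, formed as the
              difference of two uniform random values, and applies no
              noise-shaping.

```python
import random

def tpdf_dither_quantise(samples, bits=16, seed=0):
    """Quantise samples in the range ±1 to signed integers with TPDF dither."""
    rng = random.Random(seed)          # fixed seed, so runs are repeatable
    scale = 2 ** (bits - 1)
    out = []
    for s in samples:
        noise = rng.random() - rng.random()   # triangular PDF over (-1, 1)
        q = round(s * scale + noise)
        out.append(max(-scale, min(scale - 1, q)))
    return out

print(tpdf_dither_quantise([0.0, 0.25, -0.5], bits=8))
```

              Using a fixed seed here plays a similar role to giving the -R
              option: the generated noise is the same on every run.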
1536
1537       earwax Makes audio easier to listen to on headphones.  Adds  `cues'  to
1538              44.1kHz  stereo  (i.e.  audio CD format) audio so that when lis‐
1539              tened to on headphones the stereo image  is  moved  from  inside
1540              your  head  (standard for headphones) to outside and in front of
              the listener (standard for speakers).  See
              http://www.geocities.com/beinges for a full explanation.
1543
1544       echo gain-in gain-out <delay decay>
1545              Add  echoing  to  the audio.  Echoes are reflected sound and can
1546              occur naturally amongst mountains (and  sometimes  large  build‐
1547              ings)  when  talking  or  shouting; digital echo effects emulate
1548              this behaviour and are often used to help fill out the sound  of
1549              a  single  instrument or vocal.  The time difference between the
1550              original signal and the reflection is the  `delay'  (time),  and
1551              the  loudness  of the reflected signal is the `decay'.  Multiple
1552              echoes can have different delays and decays.
1553
1554              Each given delay decay pair gives the delay in milliseconds  and
1555              the  decay  (relative to gain-in) of that echo.  Gain-out is the
              volume of the output.  For example, this will make it sound as
1557              if there are twice as many instruments as are actually playing:
1558                 play lead.aiff echo 0.8 0.88 60 0.4
              If the delay is very short, then it sounds like a (metallic) robot
              playing music:
1561                 play lead.aiff echo 0.8 0.88 6 0.4
1562              A longer delay will sound like an open air concert in the  moun‐
1563              tains:
1564                 play lead.aiff echo 0.8 0.9 1000 0.3
1565              One mountain more, and:
1566                 play lead.aiff echo 0.8 0.9 1000 0.3 1800 0.25
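
              The parameters can be read as a simple feed-forward structure.
              The following Python sketch (not SoX's own code; the names are
              hypothetical) assumes that each echo is a delayed copy of the
              input scaled by its decay, and it does not extend the output to
              hold the echo tail.

```python
def echo(samples, rate, gain_in, gain_out, taps):
    """Mix delayed copies of the input; taps is a list of (delay_ms, decay)."""
    out = []
    for n, x in enumerate(samples):
        y = gain_in * x
        for delay_ms, decay in taps:
            d = int(round(delay_ms * rate / 1000.0))   # delay in samples
            if n - d >= 0:
                y += decay * samples[n - d]
        out.append(gain_out * y)
    return out

# An impulse at 1000 samples per second with one echo 2ms later.
print(echo([1.0, 0.0, 0.0, 0.0], 1000, 0.8, 0.5, [(2, 0.4)]))
```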
1567
1568       echos gain-in gain-out <delay decay>
1569              Add  a  sequence  of echoes to the audio.  Each delay decay pair
1570              gives the delay in milliseconds and the decay (relative to gain-
1571              in) of that echo.  Gain-out is the volume of the output.
1572
              Like the echo effect, echos stands for `ECHO in Sequel', that is
1574              the first echos takes the input, the second the  input  and  the
1575              first  echos,  the  third the input and the first and the second
1576              echos, ... and so on.  Care should be taken using many echos;  a
1577              single echos has the same effect as a single echo.
1578
1579              The sample will be bounced twice in symmetric echos:
1580                 play lead.aiff echos 0.8 0.7 700 0.25 700 0.3
1581              The sample will be bounced twice in asymmetric echos:
1582                 play lead.aiff echos 0.8 0.7 700 0.25 900 0.3
1583              The sample will sound as if played in a garage:
1584                 play lead.aiff echos 0.8 0.7 40 0.25 63 0.3
1585
1586       equalizer frequency[k] width[q|o|h|k] gain
1587              Apply  a  two-pole  peaking equalisation (EQ) filter.  With this
1588              filter, the signal-level at and around a selected frequency  can
1589              be  increased  or  decreased, whilst (unlike band-pass and band-
1590              reject filters) that at all other frequencies is unchanged.
1591
1592              frequency gives the filter's central frequency in Hz, width, the
1593              band-width,  and  gain  the  required gain or attenuation in dB.
1594              Beware of Clipping when using a positive gain.
1595
1596              In order to produce complex equalisation curves, this effect can
1597              be given several times, each with a different central frequency.
1598
1599              The filter is described in detail in [1].
1600
1601              This effect supports the --plot global option.
1602
1603              See also bass and treble for shelving equalisation effects.
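
              For orientation, the following Python sketch (not SoX's own
              code; the names are hypothetical) computes peaking-EQ biquad
              coefficients using the widely used `Audio EQ Cookbook' formulas.
              That [1] refers to these formulas, and that SoX uses them
              unchanged, is an assumption.

```python
import cmath
import math

def peaking_eq_coeffs(rate, freq, q, gain_db):
    """Return ((b0, b1, b2), (a0, a1, a2)) for a peaking EQ biquad."""
    a_lin = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * freq / rate
    alpha = math.sin(w0) / (2.0 * q)
    b = (1.0 + alpha * a_lin, -2.0 * math.cos(w0), 1.0 - alpha * a_lin)
    a = (1.0 + alpha / a_lin, -2.0 * math.cos(w0), 1.0 - alpha / a_lin)
    return b, a

def magnitude_db(b, a, rate, freq):
    """Filter gain in dB at one frequency, evaluated on the unit circle."""
    z1 = cmath.exp(-2j * math.pi * freq / rate)   # z^-1
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return 20.0 * math.log10(abs(num / den))

b, a = peaking_eq_coeffs(44100, 1000.0, 1.0, 6.0)
print(round(magnitude_db(b, a, 44100, 1000.0), 3))
```

              The six values could be passed to a biquad filter; at the
              central frequency the response equals the requested gain, and
              far from it the response approaches 0dB.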
1604
1605       fade [type] fade-in-length [stop-time [fade-out-length]]
1606              Apply a fade effect to the beginning, end, or both of the audio.
1607
1608              An  optional  type  can  be specified to select the shape of the
1609              fade curve: q for quarter of a sine wave,  h  for  half  a  sine
1610              wave,  t for linear (`triangular') slope, l for logarithmic, and
1611              p for inverted parabola.  The default is logarithmic.
1612
1613              A fade-in starts from the first  sample  and  ramps  the  signal
1614              level  from 0 to full volume over fade-in-length seconds.  Spec‐
1615              ify 0 seconds if no fade-in is wanted.
1616
1617              For fade-outs, the audio will be truncated at stop-time and  the
1618              signal  level will be ramped from full volume down to 0 starting
1619              at fade-out-length seconds before the stop-time.   If  fade-out-
1620              length  is not specified, it defaults to the same value as fade-
1621              in-length.  No fade-out is performed if stop-time is not  speci‐
1622              fied.   If the file length can be determined from the input file
1623              header and length-changing effects are not in effect, then 0 may
1624              be specified for stop-time to indicate the usual case of a fade-
1625              out that ends at the end of the input audio stream.
1626
1627              All times can be specified in either periods of time  or  sample
              counts.  To specify time periods use the hh:mm:ss.frac format.
              To specify using sample counts, specify the number of
1630              samples and append the letter `s' to the sample count (for exam‐
1631              ple `8000s').
1632
1633              See also the splice effect.
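
              The fade shapes can be compared with a Python sketch such as
              the one below (not SoX's own code; the names are hypothetical).
              The formulas are assumed from the shape names given above, and
              the logarithmic curve is left out because its exact form is not
              specified here.

```python
import math

def fade_in_gain(position, shape="t"):
    """Gain for a position in the range 0 to 1 across the fade-in."""
    if shape == "t":      # linear (`triangular') slope
        return position
    if shape == "q":      # quarter of a sine wave
        return math.sin(position * math.pi / 2.0)
    if shape == "h":      # half a sine wave
        return (1.0 - math.cos(position * math.pi)) / 2.0
    if shape == "p":      # inverted parabola
        return 1.0 - (1.0 - position) ** 2
    raise ValueError("shape not covered by this sketch: " + shape)

for shape in "tqhp":
    print(shape, [round(fade_in_gain(p / 4.0, shape), 3) for p in range(5)])
```

              A fade-out would apply the same curve in reverse over the last
              fade-out-length seconds before stop-time.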
1634
1635       fir [coefs-file|coefs]
1636              Use SoX's FFT convolution engine with given FIR  filter  coeffi‐
1637              cients.   If  a single argument is given then this is treated as
1638              the name of a file containing the  filter  coefficients  (white-
1639              space  separated; may contain `#' comments).  If the given file‐
1640              name is `-', or if no argument is given, then  the  coefficients
1641              are  read  from the `standard input' (stdin); otherwise, coeffi‐
1642              cients may be given on the command line.  Examples:
1643                 sox infile outfile fir 0.0195 -0.082 0.234 0.891 -0.145 0.043
1644                 sox infile outfile fir coefs.txt
1645              with coefs.txt containing
1646                 # HP filter
1647                 # freq=10000
1648                   1.2311233052619888e-01
1649                  -4.4777096106211783e-01
1650                   5.1031563346705155e-01
1651                  -6.6502926320995331e-02
1652                 ...
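
              The coefficient file format and the filtering itself can be
              sketched in Python (not SoX's own code; the names and the
              coefficient values are hypothetical).  Direct convolution is
              used here for clarity where SoX uses FFT convolution, and the
              output is not extended beyond the input length.

```python
def parse_coefs(text):
    """Parse white-space separated coefficients, ignoring `#' comments."""
    coefs = []
    for line in text.splitlines():
        line = line.split("#", 1)[0]          # drop any comment
        coefs.extend(float(tok) for tok in line.split())
    return coefs

def fir_filter(samples, coefs):
    """Direct convolution: y[n] = sum over k of coefs[k] * x[n-k]."""
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, c in enumerate(coefs):
            if n - k >= 0:
                acc += c * samples[n - k]
        out.append(acc)
    return out

coefs = parse_coefs("# two-tap average\n 0.5\n 0.5\n")
print(fir_filter([1.0, 1.0, 0.0, 0.0], coefs))
```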
1653
1654       flanger [delay depth regen width speed shape phase interp]
1655              Apply a flanging effect to the audio.  See [3]  for  a  detailed
1656              description of flanging.
1657
1658              All parameters are optional (right to left).
1659
1660                        Range     Default   Description
1661              delay     0 - 30       0      Base delay in milliseconds.
1662              depth     0 - 10       2      Added swept delay in milliseconds.
1663              regen    -95 - 95      0      Percentage regeneration (delayed
1664                                            signal feedback).
1665              width    0 - 100      71      Percentage of delayed signal mixed
1666                                            with original.
1667              speed    0.1 - 10     0.5     Sweeps per second (Hz).
1668              shape                 sin     Swept wave shape: sine|triangle.
1669              phase    0 - 100      25      Swept wave percentage phase-shift
1670                                            for multi-channel (e.g. stereo)
1671                                            flange; 0 = 100 = same phase on
1672                                            each channel.
1673              interp                lin     Digital delay-line interpolation:
1674                                            linear|quadratic.
1675
1676       gain [-e|-B|-b|-r] [-n] [-l|-h] [gain-dB]
1677              Apply  amplification  or attenuation to the audio signal, or, in
1678              some cases, to some of its channels.  Note that use  of  any  of
1679              -e, -B, -b, -r, or -n requires temporary file space to store the
1680              audio to be  processed,  so  may  be  unsuitable  for  use  with
1681              `streamed' audio.
1682
1683              Without  other  options,  gain-dB  is  used to adjust the signal
1684              power level by  the  given  number  of  dB:  positive  amplifies
1685              (beware  of Clipping), negative attenuates.  With other options,
1686              the gain-dB amplification or attenuation is (logically)  applied
1687              after the processing due to those options.
1688
1689              Given  the  -e  option,  the  levels  of the audio channels of a
1690              multi-channel file are `equalised', i.e.  gain is applied to all
1691              channels  other than that with the highest peak level, such that
1692              all channels attain the same peak level (but, without also  giv‐
1693              ing -n, the audio is not `normalised').
1694
1695              The  -B  (balance) option is similar to -e, but with -B, the RMS
1696              level is used instead of the peak level.  -B might  be  used  to
1697              correct stereo imbalance caused by an imperfect record turntable
1698              cartridge.   Note that unlike -e, -B might cause some clipping.
1699
1700              -b is similar to -B but has clipping protection, i.e.  if neces‐
1701              sary  to  prevent  clipping  whilst  balancing,  attenuation  is
1702              applied to all channels.  Note,  however,  that  in  conjunction
1703              with -n, -B and -b are synonymous.
1704
1705              The  -r option is used in conjunction with a prior invocation of
1706              gain with the -h option - see below for details.
1707
1708              The -n option normalises the audio to 0dB FSD; it is often  used
1709              in  conjunction  with  a negative gain-dB to the effect that the
1710              audio is normalised to a given level below 0dB.  For example,
1711                 sox infile outfile gain -n
1712              normalises to 0dB, and
1713                 sox infile outfile gain -n -3
1714              normalises to -3dB.
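
                  To relate a dB figure to sample values, the conventional
                  amplitude conversion factor = 10^(dB/20) can be used.  The
                  sketch below assumes that convention; the function name
                  db_to_amplitude is hypothetical and not part of SoX.

```shell
#!/bin/sh
# db_to_amplitude DB: print the linear amplitude factor for a dB value,
# using factor = 10^(dB/20). The function name is illustrative only.
db_to_amplitude() {
    awk -v db="$1" 'BEGIN { printf "%.4f\n", 10 ^ (db / 20) }'
}

db_to_amplitude -3    # peak relative to full scale after: gain -n -3
db_to_amplitude 6     # multiplier corresponding to: gain 6
```

                  A result below 1 is an attenuation and a result above 1 an
                  amplification, which is why positive gain-dB values risk
                  clipping.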
1715
1716              The -l option invokes a simple limiter, e.g.
1717                 sox infile outfile gain -l 6
1718              will apply 6dB of gain but never clip.  Note that limiting by more
1719              than a few dB, other than occasionally within a piece of audio, is
1720              not recommended as it can cause  audible  distortion.   See  the
1721              compand effect for a more capable limiter.
1722
1723              The  -h  option  is  used to apply gain to provide head-room for
1724              subsequent processing.  For example, with
1725                 sox infile outfile gain -h bass +6
1726              6dB of attenuation will be applied prior to  the  bass  boosting
1727              effect  thus  ensuring  that  it will not clip.  Of course, with
1728              bass, it is obvious how much headroom will be needed,  but  with
1729              other  effects  (e.g.   rate, dither) it is not always as clear.
1730              Another advantage of using  gain  -h  rather  than  an  explicit
1731              attenuation  is  that if the headroom is not used by subsequent
1732              effects, it can be reclaimed with gain -r, for example:
1733                 sox infile outfile gain -h bass +6 rate 44100 gain -r
1734              The above effects chain guarantees never to clip nor amplify; it
1735              attenuates if necessary to prevent clipping, but by only as much
1736              as is needed to do so.
1737
1738              Output  formatting  (dithering  and  bit-depth  reduction)  also
1739              requires headroom (which cannot be `reclaimed'), e.g.
1740                 sox infile outfile gain -h bass +6 rate 44100 gain -rh dither
1741              Here, the second gain invocation reclaims as much of the headroom
1742              as it can from the preceding effects, but retains as much
1743              headroom as is needed for subsequent processing.  The SoX global
1744              option -G can be given to automatically invoke gain -h and  gain
1745              -r.
1746
1747              See also the norm and vol effects.
1748
1749       highpass|lowpass [-1|-2] frequency[k] [width[q|o|h|k]]
1750              Apply  a  high-pass or low-pass filter with 3dB point frequency.
1751              The filter can be either single-pole (with -1),  or  double-pole
1752              (the  default,  or  with -2).  width applies only to double-pole
1753              filters; the default is  Q  =  0.707  and  gives  a  Butterworth
1754              response.  The filters roll off at 6dB per pole per octave (20dB
1755              per pole per decade).  The double-pole filters are described  in
1756              detail in [1].
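
                  The quoted slope gives a rough estimate of attenuation well
                  beyond the 3dB point.  The sketch below applies only that
                  asymptotic slope, so it is a guide rather than the filter's
                  true response; the function name rolloff_db is hypothetical.

```shell
#!/bin/sh
# rolloff_db POLES CUTOFF_HZ FREQ_HZ: estimate stop-band attenuation in dB
# from the asymptotic slope of 6 dB per pole per octave. Written for a
# low-pass filter (FREQ_HZ above CUTOFF_HZ); for a high-pass filter, swap
# the two frequency arguments. It ignores the response near the 3 dB point.
rolloff_db() {
    awk -v p="$1" -v fc="$2" -v f="$3" \
        'BEGIN { printf "%.1f\n", 6 * p * log(f / fc) / log(2) }'
}

rolloff_db 1 1000 4000   # single-pole (-1) lowpass at 1 kHz, seen at 4 kHz
rolloff_db 2 1000 4000   # double-pole (default) filter, same frequencies
```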
1757
1758              These effects support the --plot global option.
1759
1760              See also sinc for filters with a steeper roll-off.
1761
1762       ladspa module [plugin] [argument...]
1763              Apply  a  LADSPA [5] (Linux Audio Developer's Simple Plugin API)
1764              plugin.  Despite the name, LADSPA is not Linux-specific,  and  a
1765              wide  range  of  effects is available as LADSPA plugins, such as
1766              cmt [6] (the Computer Music Toolkit) and Steve  Harris's  plugin
1767              collection  [7].  The  first  argument is the plugin module, the
1768              second the name of the plugin (a module can  contain  more  than
1769              one plugin) and any other arguments are for the control ports of
1770              the plugin. Missing arguments are supplied by default values  if
1771              possible.  Only  plugins  with  at  most one audio input and one
1772              audio output port can be used.  If found, the environment  vari‐
1773              able LADSPA_PATH will be used as search path for plugins.
1774
1775       loudness [gain [reference]]
1776              Loudness  control  -  similar  to  the gain effect, but provides
1777              equalisation   for   the    human    auditory    system.     See
1778              http://en.wikipedia.org/wiki/Loudness for a detailed description
1779              of loudness.  The gain is adjusted by the given  gain  parameter
1780              (usually negative) and the signal equalised according to ISO 226
1781              w.r.t. a reference level of 65dB, though an  alternative  refer‐
1782              ence level may be given if the original audio has been equalised
1783              for some other optimal level.  A default gain of -10dB  is  used
1784              if a gain value is not given.
1785
1786              See also the gain effect.
1787
1788       lowpass [-1|-2] frequency[k] [width[q|o|h|k]]
1789              Apply  a  low-pass  filter.  See the description of the highpass
1790              effect for details.
1791
1792       mcompand "attack1,decay1{,attack2,decay2}
1793              [soft-knee-dB:]in-dB1[,out-dB1]{,in-dB2,out-dB2}
1794              [gain    [initial-volume-dB    [delay]]]"     {crossover-freq[k]
1795              "attack1,..."}
1796
1797              The multi-band compander is similar to the single-band compander
1798              but the audio is first divided into bands  using  Linkwitz-Riley
1799              cross-over filters and a separately specifiable compander run on
1800              each band.  See the compand effect for  the  definition  of  its
1801              parameters.   Compand  parameters  are  specified between double
1802              quotes and the crossover frequency for that  band  is  given  by
1803              crossover-freq; these can be repeated to create multiple bands.
1804
1805              For  example,  the following (one long) command shows how multi-
1806              band companding is typically used in FM radio:
1807                 play track1.wav gain -3 sinc 8000- 29 100 mcompand \
1808                   "0.005,0.1 -47,-40,-34,-34,-17,-33" 100 \
1809                   "0.003,0.05 -47,-40,-34,-34,-17,-33" 400 \
1810                   "0.000625,0.0125 -47,-40,-34,-34,-15,-33" 1600 \
1811                   "0.0001,0.025 -47,-40,-34,-34,-31,-31,-0,-30" 6400 \
1812                   "0,0.025 -38,-31,-28,-28,-0,-25" \
1813                   gain 15 highpass 22 highpass 22 sinc -n 255 -b 16 -17500 \
1814                   gain 9 lowpass -1 17801
1815              The audio file is played with a simulated  FM  radio  sound  (or
1816              broadcast  signal  condition if the lowpass filter at the end is
1817              skipped).  Note that the pipeline is set up with  US-style  75us
1818              pre-emphasis.
1819
1820              See also compand for a single-band companding effect.
1821
1822       mixer [ -l|-r|-f|-b|-1|-2|-3|-4|n{,n} ]
1823              Reduce the number of audio channels by mixing or selecting chan‐
1824              nels, or increase the number of channels  by  duplicating  chan‐
1825              nels.   Note:  this effect operates on the audio channels within
1826              the SoX effects processing chain; it should not be confused with
1827              the  -m  global  option  (where  multiple files are mix-combined
1828              before entering the effects chain).
1829
1830              When reducing the number of channels it is possible to  use  the
1831              -l, -r, -f, -b, -1, -2, -3, or -4 options to select only the left,
1832              right, front, or back channel(s), or a specific channel, for the
1833              output instead of averaging the channels.  In quad-channel files,
1834              the -l and -r options still perform averaging; select an exact
1835              channel to prevent this.
1836
1837              The mixer effect can also be invoked with up to 16 numbers, sep‐
1838              arated by commas, which specify the proportion (0 = 0% and  1  =
1839              100%) of each input channel that is to be mixed into each output
1840              channel.  In two-channel mode, 4 numbers are given: l → l,  l  →
1841              r,  r  →  l, and r → r, respectively.  In four-channel mode, the
1842              first 4 numbers give the proportions for the  left-front  output
1843              channel,  as  follows:  lf  → lf, rf → lf, lb → lf, and rb → lf.
1844              The next 4 give the right-front output in the same  order,  then
1845              left-back and right-back.
1846
1847              It  is  also  possible to use the 16 numbers to expand or reduce
1848              the channel count; just specify 0 for unused channels.
1849
1850              Finally, certain reduced combinations of numbers can be specified
1851              for certain input/output channel combinations.
1852
1853                   In Ch   Out Ch   Num   Mappings
1854                     2       1       2    l → l, r → l
1855                     2       2       1    adjust balance
1856                     4       1       4    lf → l, rf → l, lb → l, rb → l
1857                     4       2       2    lf → l&rf → r, lb → l&rb → r
1858                     4       4       1    adjust balance
1859                     4       4       2    front balance, back balance
1860
1861              See  also  remix  for a mixing effect that handles any number of
1862              channels.
1863
1864       noiseprof [profile-file]
1865              Calculate a profile of the audio for  use  in  noise  reduction.
1866              See the description of the noisered effect for details.
1867
1868       noisered [profile-file [amount]]
1869              Reduce  noise  in  the  audio signal by profiling and filtering.
1870              This effect is moderately effective at removing consistent back‐
1871              ground noise such as hiss or hum.  To use it, first run SoX with
1872              the noiseprof effect on a section of audio  that  ideally  would
1873              contain  silence  but in fact contains noise - such sections are
1874              typically found at the beginning or  the  end  of  a  recording.
1875              noiseprof  will write out a noise profile to profile-file, or to
1876              stdout if no profile-file or if `-' is given.  E.g.
1877                 sox speech.wav -n trim 0 1.5 noiseprof speech.noise-profile
1878              To actually remove the noise, run SoX again, this time with  the
1879              noisered effect; noisered will reduce noise according to a noise
1880              profile (which was generated by noiseprof),  from  profile-file,
1881              or from stdin if no profile-file or if `-' is given.  E.g.
1882                 sox speech.wav cleaned.wav noisered speech.noise-profile 0.3
1883              How much noise should be removed is specified by amount, a number
1884              between 0 and 1 with a default  of  0.5.   Higher  numbers  will
1885              remove  more  noise but present a greater likelihood of removing
1886              wanted components of the  audio  signal.   Before  replacing  an
1887              original recording with a noise-reduced version, experiment with
1888              different amount values to find the optimal one for your  audio;
1889              use  headphones  to  check  that you are happy with the results,
1890              paying particular attention to quieter sections of the audio.
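
                  One way to organise such an experiment is to generate one
                  output file per candidate amount and compare them by ear.
                  This is a sketch only: the function name noisered_sweep, the
                  output naming scheme, and the candidate amounts are
                  assumptions chosen for illustration.

```shell
#!/bin/sh
# noisered_sweep INFILE PROFILE: print one noisered command per candidate
# amount so that the results can be compared by ear. The function name, the
# output file names, and the list of amounts are illustrative choices.
noisered_sweep() {
    for amount in 0.1 0.2 0.3 0.5; do
        echo "sox $1 cleaned-$amount.wav noisered $2 $amount"
    done
}

# Dry run: this prints the commands; pipe the output to sh to execute them.
noisered_sweep speech.wav speech.noise-profile
```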
1891
1892              On most systems, the two stages - profiling and reduction -  can
1893              be combined using a pipe, e.g.
1894                 sox noisy.wav -n trim 0 1 noiseprof | play noisy.wav noisered
1895
1896       norm [dB-level]
1897              Normalise the audio.  norm is just an alias for gain -n; see the
1898              gain effect for details.
1899
1900              Note that norm's -i and -b options are deprecated  (having  been
1901              superseded  by  gain  -en  and gain -B respectively) and will be
1902              removed in a future release.
1903
1904       oops   Out Of Phase Stereo effect.  Mixes  stereo  to  twin-mono  where
1905              each  mono  channel contains the difference between the left and
1906              right stereo channels.  This is sometimes known as the `karaoke'
1907              effect as it often has the effect of removing most or all of the
1908              vocals from a recording.
1909
1910       overdrive [gain(20) [colour(20)]]
1911              Non linear distortion.  The colour parameter controls the amount
1912              of even harmonic content in the over-driven output.
1913
1914       pad { length[@position] }
1915              Pad  the  audio  with silence, at the beginning, the end, or any
1916              specified points through the audio.  Both  length  and  position
1917              can specify a time or, if appended with an `s', a number of sam‐
1918              ples.  length is the amount of silence to  insert  and  position
1919              the  position  in  the input audio stream at which to insert it.
1920              Any number of lengths and positions may be  specified,  provided
1921              that  a  specified  position  is not less than the previous one.
1922              position is optional for the first and  last  lengths  specified
1923              and, if omitted, corresponds to the beginning and the end of the
1924              audio respectively.  For example, pad 1.5 1.5 adds  1.5  seconds
1925              of  silence  padding  at  each  end  of  the  audio,  whilst pad
1926              4000s@3:00 inserts 4000 samples of silence 3  minutes  into  the
1927              audio.  If silence is wanted only at the end of the audio, spec‐
1928              ify either the end position or specify a zero-length pad at  the
1929              start.
1930
1931              See  also delay for an effect that can add silence at the begin‐
1932              ning of the audio on a channel-by-channel basis.
1933
1934       phaser gain-in gain-out delay decay speed [-s|-t]
1935              Add a phasing effect to the  audio.   See  [3]  for  a  detailed
1936              description of phasing.
1937
1938              delay/decay/speed  gives the delay in milliseconds and the decay
1939              (relative to gain-in) with a modulation speed in Hz.  The modulation
1940              is either sinusoidal (-s), which is preferable for multiple
1941              instruments, or triangular (-t), which gives single instruments a
1942              sharper  phasing  effect.   The decay should be less than 0.5 to
1943              avoid feedback, and usually no less than 0.1.  Gain-out  is  the
1944              volume of the output.
1945
1946              For example:
1947                 play snare.flac phaser 0.8 0.74 3 0.4 0.5 -t
1948              Gentler:
1949                 play snare.flac phaser 0.9 0.85 4 0.23 1.3 -s
1950              A popular sound:
1951                 play snare.flac phaser 0.89 0.85 1 0.24 2 -t
1952              More severe:
1953                 play snare.flac phaser 0.6 0.66 3 0.6 2 -t
1954
1955       pitch [-q] shift [segment [search [overlap]]]
1956              Change the audio pitch (but not tempo).
1957
1958              shift  gives  the  pitch  shift  as positive or negative `cents'
1959              (i.e. 100ths of  a  semitone).   See  the  tempo  effect  for  a
1960              description of the other parameters.
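
                  A shift in cents corresponds to a frequency ratio of
                  2^(cents/1200), assuming equal temperament (1200 cents per
                  octave).  The sketch below computes that ratio; the function
                  name cents_to_ratio is hypothetical and not part of SoX.

```shell
#!/bin/sh
# cents_to_ratio CENTS: print the frequency ratio for a pitch shift given in
# cents, assuming equal temperament: ratio = 2^(cents/1200).
cents_to_ratio() {
    awk -v c="$1" 'BEGIN { printf "%.4f\n", 2 ^ (c / 1200) }'
}

cents_to_ratio 100     # one semitone up, as with: pitch 100
cents_to_ratio -1200   # one octave down, as with: pitch -1200
```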
1961
1962              See also the speed and tempo effects.
1963
1964       rate [-q|-l|-m|-h|-v] [override-options] RATE[k]
1965              Change  the audio sampling rate (i.e. resample the audio) to any
1966              given RATE (even non-integer if this is supported by the  output
1967              file format) using a quality level defined as follows:
1968
1969                           Quality   Band-   Rej dB   Typical Use
1970                                     width
1971                     -q     quick     n/a    ≈30 @    playback on
1972                                              Fs/4    ancient hardware
1973                     -l      low      80%     100     playback on old
1974                                                      hardware
1975                     -m    medium     95%     100     audio playback
1976                     -h     high      95%     125     16-bit mastering
1977                                                      (use with dither)
1978                     -v   very high   95%     175     24-bit mastering
1979
1980              where Band-width is the percentage of the audio  frequency  band
1981              that  is  preserved  and Rej dB is the level of noise rejection.
1982              Increasing levels of resampling quality come at the  expense  of
1983              increasing  amounts of time to process the audio.  If no quality
1984              option is given, the quality level used is `high'.
1985
1986              The `quick' algorithm uses cubic interpolation; all  others  use
1987              band-limited  interpolation.   By default, all algorithms have a
1988              `linear' phase response; for `medium', `high' and  `very  high',
1989              the phase response is configurable (see below).
1990
1991              The  rate  effect  is  invoked  automatically if SoX's -r option
1992              specifies a rate that is different to that of the input file(s).
1993              Alternatively, if this effect is given explicitly, then SoX's -r
1994              option need not be given.  For example, the following  two  com‐
1995              mands are equivalent:
1996                 sox input.wav -r 48k output.wav bass -3
1997                 sox input.wav        output.wav bass -3 rate 48k
1998              though  the  second  command  is more flexible as it allows rate
1999              options to be given, and allows the effects to be ordered  arbi‐
2000              trarily.
2001
2002                                    *        *        *
2003
2004              Warning: technically detailed discussion follows.
2005
2006              The  simple  quality selection described above provides settings
2007              that satisfy the needs of the vast majority of resampling tasks.
2008              Occasionally,  however,  it  may  be  desirable to fine-tune the
2009              resampler's filter response; this can be  achieved  using  over‐
2010              ride options, as detailed in the following table:
2011
2012              -M/-I/-L     Phase response = minimum/intermediate/linear
2013              -s           Steep filter (band-width = 99%)
2014              -a           Allow aliasing/imaging above the pass-band
2015              -b 74-99.7   Any band-width %
2016              -p 0-100     Any phase response (0 = minimum, 25 = intermediate,
2017                           50 = linear, 100 = maximum)
2018
2019              N.B.  Override options cannot be used with the `quick' or `low'
2020              quality algorithms.
2021
2022              All  resamplers  use  filters  that  can sometimes create `echo'
2023              (a.k.a.  `ringing') artefacts with  transient  signals  such  as
2024              those  that occur with `finger snaps' or other highly percussive
2025              sounds.  Such artefacts are much more noticeable  to  the  human
2026              ear if they occur before the transient (`pre-echo') than if they
2027              occur after it (`post-echo').  Note that the frequency of any such
2028              artefacts is related to the smaller of the original and new
2029              sampling rates but that if this is at least 44.1kHz, then the
2030              artefacts will lie outside the range of human hearing.
2031
2032              A phase response setting may be used to control the distribution
2033              of any transient echo between `pre'  and  `post':  with  minimum
2034              phase, there is no pre-echo but the longest post-echo; with lin‐
2035              ear phase, pre and post echo are in  equal  amounts  (in  signal
2036              terms, but not audibility terms); the intermediate phase setting
2037              attempts to find the best compromise by selecting a small length
2038              (and level) of pre-echo and a medium-length post-echo.
2039
2040              Minimum,  intermediate,  or  linear  phase  response is selected
2041              using the -M, -I, or -L option; a custom phase response  can  be
2042              created  with  the -p option.  Note that phase responses between
2043              `linear' and `maximum' (greater than 50) are rarely useful.
2044
2045              A resampler's band-width setting determines how much of the fre‐
2046              quency  content of the original signal (w.r.t. the original sam‐
2047              ple rate when up-sampling, or the new sample rate when down-sam‐
2048              pling)  is preserved during conversion.  The term `pass-band' is
2049              used to refer to all frequencies  up  to  the  band-width  point
2050              (e.g.  for 44.1kHz sampling rate, and a resampling band-width of
2051              95%, the pass-band represents frequencies  from  0Hz  (D.C.)  to
2052              circa  21kHz).  Increasing the resampler's band-width results in
2053              a slower conversion and can increase  transient  echo  artefacts
2054              (and vice versa).
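
                  The pass-band edge in the example above can be reproduced by
                  taking the band-width percentage of the Nyquist frequency
                  (half the sampling rate) of the smaller of the two rates.
                  This reading of the text is an assumption, and the function
                  name passband_hz is hypothetical.

```shell
#!/bin/sh
# passband_hz IN_RATE OUT_RATE BANDWIDTH_PERCENT: print the upper edge of the
# pass-band in Hz, taken as a percentage of the Nyquist frequency (half the
# sampling rate) of the smaller of the two rates.
passband_hz() {
    awk -v a="$1" -v b="$2" -v bw="$3" \
        'BEGIN { m = (a < b) ? a : b; printf "%.1f\n", m / 2 * bw / 100 }'
}

passband_hz 96000 44100 95   # default band-width, down-sampling to 44.1 kHz
passband_hz 96000 44100 99   # with the -s (steep filter) override
```

                  The first call yields a value just under 21kHz, matching the
                  `circa 21kHz' figure quoted above.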
2055
2056              The  -s `steep filter' option changes resampling band-width from
2057              the default 95% (based on the 3dB point), to 99%.  The -b option
2058              allows  the  band-width  to  be  set  to  any value in the range
2059              74-99.7 %, but note that band-width values greater than 99%  are
2060              not recommended for normal use as they can cause excessive tran‐
2061              sient echo.
2062
2063              If the -a option is given, then aliasing/imaging above the pass-
2064              band is allowed.  For example, with 44.1kHz sampling rate, and a
2065              resampling band-width of 95%, this means that frequency  content
2066              above  21kHz  can be distorted; however, since this is above the
2067              pass-band (i.e.  above the highest frequency  of  interest/audi‐
2068              bility),  this  may  not be a problem.  The benefits of allowing
2069              aliasing/imaging are reduced processing time,  and  reduced  (by
2070              almost half) transient echo artefacts.  Note that if this option
2071              is  given,  then  the  minimum  band-width  allowable  with   -b
2072              increases to 85%.
2073
2074              Examples:
2075                 sox input.wav -b 16 output.wav rate -s -a 44100 dither -s
2076              default  (high)  quality  resampling;  overrides:  steep filter,
2077              allow aliasing; to 44.1kHz sample rate; noise-shaped  dither  to
2078              16-bit WAV file.
2079                 sox input.wav -b 24 output.aiff rate -v -I -b 90 48k
2080              very  high  quality  resampling;  overrides: intermediate phase,
2081              band-width 90%; to 48k sample rate; store output to 24-bit  AIFF
2082              file.
2083
2084                                    *        *        *
2085
2086              The  pitch,  speed  and tempo effects all use the rate effect at
2087              their core.
2088
2089       remix [-a|-m|-p] <out-spec>
2090              out-spec  = in-spec{,in-spec} | 0
2091              in-spec   = [in-chan][-[in-chan2]][vol-spec]
2092              vol-spec  = p|i|v[volume]
2093
2094              Select and mix input audio channels into output audio  channels.
2095              Each  output channel is specified, in turn, by a given out-spec:
2096              a list of contributing input channels and volume specifications.
2097
2098              Note that this effect operates on the audio channels within  the
2099              SoX effects processing chain; it should not be confused with the
2100              -m global option (where multiple files are  mix-combined  before
2101              entering the effects chain).
2102
2103              An  out-spec  contains comma-separated input channel-numbers and
2104              hyphen-delimited channel-number ranges; alternatively, 0 may  be
2105              given to create a silent output channel.  For example,
2106                 sox input.wav output.wav remix 6 7 8 0
2107              creates  an output file with four channels, where channels 1, 2,
2108              and 3 are copies of channels 6, 7, and 8 in the input file,  and
2109              channel 4 is silent.  Whereas
2110                 sox input.wav output.wav remix 1-3,7 3
2111              creates  a  (somewhat bizarre) stereo output file where the left
2112              channel is a mix-down of input channels 1, 2, 3, and 7, and  the
2113              right channel is a copy of input channel 3.
2114
2115              Where  a  range of channels is specified, the channel numbers to
2116              the left and right of the hyphen are optional and default  to  1
2117              and to the number of input channels respectively. Thus
2118                 sox input.wav output.wav remix -
2119              performs a mix-down of all input channels to mono.
2120
2121              By  default,  where an output channel is mixed from multiple (n)
2122              input channels, each input channel will be scaled by a factor of
2123              ¹/n.   Custom  mixing  volumes  can  be set by following a given
2124              input channel or range of input channels with a vol-spec (volume
2125              specification).  This is one of the letters p, i, or v, followed
2126              by a volume number, the meaning of which depends  on  the  given
2127              letter and is defined as follows:
2128
2129                      Letter   Volume number        Notes
2130                        p      power adjust in dB   0 = no change
2131                        i      power adjust in dB   As `p', but invert
2132                                                    the audio
2133                        v      voltage multiplier   1 = no change, 0.5
2134                                                    ≈ 6dB attenuation,
2135                                                    2 ≈ 6dB gain, -1 =
2136                                                    invert
2137
2138              If  an out-spec includes at least one vol-spec then, by default,
2139              ¹/n scaling is not applied to any other  channels  in  the  same
2140              out-spec (though may be in other out-specs).  The -a (automatic)
2141              option however, can be given to retain the automatic scaling  in
2142              this case.  For example,
2143                 sox input.wav output.wav remix 1,2 3,4v0.8
2144              results in channel level multipliers of 0.5,0.5 1,0.8, whereas
2145                 sox input.wav output.wav remix -a 1,2 3,4v0.8
2146              results in channel level multipliers of 0.5,0.5 0.5,0.8.
2147
2148              The  -m  (manual)  option  disables all automatic volume adjust‐
2149              ments, so
2150                 sox input.wav output.wav remix -m 1,2 3,4v0.8
2151              results in channel level multipliers of 1,1 1,0.8.
2152
2153              The volume number is optional and omitting it corresponds to  no
2154              volume change; however, the only case in which this is useful is
2155              in conjunction with i.  For example,  if  input.wav  is  stereo,
2156              then
2157                 sox input.wav output.wav remix 1,2i
2158              is a mono equivalent of the oops effect.
2159
2160              If  the  -p  option  is given, then any automatic ¹/n scaling is
2161              replaced by ¹/√n (`power') scaling; this gives a louder mix  but
2162              one that might occasionally clip.
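
                  The difference between the two automatic scalings can be seen
                  by computing the per-channel multiplier for a given number of
                  mixed input channels.  This is a sketch of the arithmetic
                  only; the function name mix_scale and its `power' argument are
                  hypothetical.

```shell
#!/bin/sh
# mix_scale N [power]: print the per-channel multiplier used when N input
# channels are mixed into one output channel: 1/N by default, or 1/sqrt(N)
# when the second argument is "power" (corresponding to remix -p).
mix_scale() {
    awk -v n="$1" -v mode="${2:-linear}" \
        'BEGIN { printf "%.4f\n", ((mode == "power") ? 1 / sqrt(n) : 1 / n) }'
}

mix_scale 4          # default scaling for a four-channel mix-down
mix_scale 4 power    # louder mix, as with remix -p
```

                  With `power' scaling, four full-scale channels that happen to
                  be in phase would sum to twice full scale, which is why such a
                  mix might occasionally clip.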
2163
2164                                    *        *        *
2165
2166              One use of the remix effect is to split an audio file into a set
2167              of files, each containing one of the  constituent  channels  (in
2168              order to perform subsequent processing on individual audio chan‐
2169              nels).  Where more than a few channels are  involved,  a  script
2170              such as the following (Bourne shell script) is useful:
2171              #!/bin/sh
2172              chans=`soxi -c "$1"`
2173              while [ $chans -ge 1 ]; do
2174                 chans0=`printf %02i $chans`   # 2 digits hence up to 99 chans
2175                 out=`echo "$1"|sed "s/\(.*\)\.\(.*\)/\1-$chans0.\2/"`
2176                 sox "$1" "$out" remix $chans
2177                 chans=`expr $chans - 1`
2178              done
2179              If  a  file  input.wav containing six audio channels were given,
2180              the  script  would  produce  six  output  files:   input-01.wav,
2181              input-02.wav, ..., input-06.wav.
2182
2183              See also mixer and swap for similar effects.
2184
2185       repeat count
2186              Repeat  the  entire  audio count times.  Requires temporary file
2187              space to store the audio to be repeated.   Note  that  repeating
2188              once  yields  two  copies:  the  original audio and the repeated
2189              audio.
2190
2191       reverb [-w|--wet-only] [reverberance (50%) [HF-damping (50%)
2192              [room-scale (100%) [stereo-depth (100%)
2193              [pre-delay (0ms) [wet-gain (0dB)]]]]]]
2194
2195              Add reverberation to the audio using the  `freeverb'  algorithm.
2196              A  reverberation effect is sometimes desirable for concert halls
2197              that are too small or contain so many  people  that  the  hall's
2198              natural  reverberance is diminished.  Applying a small amount of
2199              stereo reverb to a (dry) mono signal will usually make it  sound
2200              more  natural.  See [3] for a detailed description of reverbera‐
2201              tion.
2202
2203              Note that this effect increases both the volume and  the  length
2204              of the audio, so to prevent clipping in these domains, a typical
2205              invocation might be:
2206                 play dry.wav gain -3 pad 0 3 reverb
2207              The -w option can be given to select only the `wet' signal, thus
2208              allowing  it to be processed further, independently of the `dry'
2209              signal.  E.g.
2210                 play -m voice.wav "|sox voice.wav -p reverse reverb -w reverse"
2211              for a reverse reverb effect.
2212
2213       reverse
2214              Reverse the audio completely.  Requires temporary file space  to
2215              store the audio to be reversed.
2216
2217       riaa   Apply  RIAA vinyl playback equalisation.  The sampling rate must
2218              be one of: 44.1, 48, 88.2, 96 kHz.
2219
2220              This effect supports the --plot global option.
2221
       silence [-l] above-periods [duration threshold[d|%]
              [below-periods duration threshold[d|%]]]
2224
2225              Removes silence from the beginning, middle, or end of the audio.
2226              `Silence' is determined by a specified threshold.
2227
2228              The  above-periods  value is used to indicate if audio should be
2229              trimmed at the beginning of the audio. A value of zero indicates
              no silence should be trimmed from the beginning. When a non-zero
              above-periods is specified, audio is trimmed until non-silence
              is found. Normally, when trimming silence from the beginning of
              the audio, above-periods will be 1, but it can be increased to higher
2234              values  to  trim all audio up to a specific count of non-silence
2235              periods. For example, if you had an audio file  with  two  songs
2236              that  each  contained  2 seconds of silence before the song, you
2237              could specify an above-period of 2 to  strip  out  both  silence
2238              periods and the first song.
2239
              When above-periods is non-zero, you must also specify a duration
              and threshold. Duration indicates the amount of time that non-
              silence must be detected before trimming stops. By increasing
              the duration, bursts of noise can be treated as silence and
              trimmed off.
2245
2246              Threshold is used to indicate what sample value you should treat
2247              as silence.  For digital audio, a value of 0 may be fine but for
2248              audio  recorded  from analog, you may wish to increase the value
2249              to account for background noise.
2250
2251              When optionally trimming silence from the end of the audio,  you
2252              specify a below-periods count.  In this case, below-period means
              to remove all audio after silence is detected.  Normally, this
              will be a value of 1, but it can be increased to skip over
              periods of silence that are wanted.  For example, if you have a
              song with 2 seconds of silence in the middle and 2 seconds at
              the end, you could set below-period to a value of 2 to skip over
              the silence in the middle of the audio.
2259
2260              For  below-periods,  duration specifies a period of silence that
2261              must exist before audio is not copied any more.  By specifying a
2262              higher  duration,  silence  that  is  wanted  can be left in the
2263              audio.  For example, if you have a song with an expected 1  sec‐
2264              ond  of  silence  in  the middle and 2 seconds of silence at the
2265              end, a duration of 2 seconds could be used to skip over the mid‐
2266              dle silence.
2267
2268              Unfortunately,  you  must  know the length of the silence at the
              end of your audio file to trim off silence reliably.  A
              workaround is to use the silence effect in combination with the
2271              reverse effect.  By first reversing the audio, you can  use  the
2272              above-periods  to  reliably  trim all audio from what looks like
2273              the front of the file.  Then reverse the file again to get  back
2274              to normal.
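
              The workaround above can be sketched as a small helper that
              builds the effects chain.  The function name and the default
              duration (0.1 seconds) and threshold (1%) are illustrative
              assumptions, not recommended values:

```shell
#!/bin/sh
# Sketch: build an effects chain that trims silence from both ends of a
# file using the silence/reverse workaround. The helper name and its
# defaults are assumptions made for illustration.
trim_ends_effects() {
    dur=${1:-0.1}     # non-silence must last this long (seconds)
    thr=${2:-1%}      # levels at or below this are treated as silence
    # Trim the front, reverse, trim what was the end, reverse back.
    echo "silence 1 $dur $thr reverse silence 1 $dur $thr reverse"
}

# Usage (requires sox):
#   sox in.wav out.wav $(trim_ends_effects 0.1 1%)
```

              Because reverse needs temporary file space, this approach may
              be slow on very long files.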
2275
2276              To  remove  silence  from the middle of a file, specify a below-
2277              periods that is negative.  This value is then treated as a posi‐
2278              tive  value  and  is  also  used  to  indicate the effect should
2279              restart processing as specified by the above-periods, making  it
2280              suitable  for  removing  periods of silence in the middle of the
2281              audio.
2282
              The -l option indicates that audio of length below-periods
              duration should be left intact at the beginning of each period
              of silence.  This is useful, for example, if you want to shorten
              long pauses between words without removing them completely.
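
              As a hedged illustration of a negative below-periods combined
              with -l, the following sketch builds an effects chain that
              shortens every silent stretch in a file.  The helper name and
              the default values are assumptions chosen for the example:

```shell
#!/bin/sh
# Sketch: effects chain that shortens silent stretches throughout a file.
# Helper name and default values are illustrative assumptions.
shorten_pauses_effects() {
    above_dur=${1:-0.1}   # non-silence needed before copying (re)starts
    below_dur=${2:-0.5}   # silence needed before copying stops; with -l
                          # this much of each pause is left intact
    thr=${3:-1%}
    # above-periods 1 trims leading silence; below-periods -1 makes the
    # effect restart above-periods processing after each silent period.
    echo "silence -l 1 $above_dur $thr -1 $below_dur $thr"
}

# Usage (requires sox):
#   sox speech.wav tight.wav $(shorten_pauses_effects)
```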
2287
2288              The  period  counts are in units of samples. Duration counts may
2289              be in the format of hh:mm:ss.frac, or the exact  count  of  sam‐
2290              ples.   Threshold numbers may be suffixed with d to indicate the
2291              value is in decibels, or % to indicate a percentage  of  maximum
2292              value of the sample value (0% specifies pure digital silence).
2293
2294              The following example shows how this effect can be used to start
2295              a recording that does not contain the delay at the  start  which
2296              usually  occurs  between  `pressing  the  record button' and the
2297              start of the performance:
2298                 rec parameters filename other-effects silence 1 5 2%
2299
       sinc [-a att|-b beta] [-p phase|-M|-I|-L] [-t tbw|-n taps]
              [freqHP][-freqLP [-t tbw|-n taps]]
2302              Apply  a sinc kaiser-windowed low-pass, high-pass, band-pass, or
2303              band-reject filter to the signal.  The freqHP and freqLP parame‐
2304              ters  give  the frequencies of the 6dB points of a high-pass and
2305              low-pass filter that may be invoked individually,  or  together.
2306              If both are given, then freqHP < freqLP creates a band-pass fil‐
2307              ter, freqHP > freqLP creates a band-reject filter.
2308
2309              The default stop-band attenuation of  120dB  can  be  overridden
2310              with  -a;  alternatively, the kaiser-window `beta' parameter can
2311              be given directly with -b.
2312
2313              The default transition band-width of 5% of the total band can be
2314              overridden with -t (and tbw in Hertz); alternatively, the number
2315              of filter taps can be given directly with -n.
2316
2317              If both freqHP and freqLP are given, then  a  -t  or  -n  option
2318              given  to  the  left of the frequencies applies to both frequen‐
2319              cies; one of these options given to the right of the frequencies
2320              applies only to freqLP.
2321
2322              The  -p,  -M,  -I,  and  -L  options  control the filter's phase
2323              response; see the rate effect for details.
2324
2325              This effect supports the --plot global option.
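
              The rule relating freqHP and freqLP to the filter type can be
              sketched as below.  The helper is hypothetical and takes plain
              integer frequencies in Hz to keep the comparison simple; the
              case where both frequencies are equal is not covered by the
              description above, so it is reported as unspecified:

```shell
#!/bin/sh
# Sketch: report which filter a freqHP/freqLP pair selects, following the
# rule stated in the text. Hypothetical helper; pass "" to omit a value.
sinc_kind() {
    hp=$1
    lp=$2
    if [ -n "$hp" ] && [ -n "$lp" ]; then
        if [ "$hp" -lt "$lp" ]; then
            echo band-pass
        elif [ "$hp" -gt "$lp" ]; then
            echo band-reject
        else
            echo unspecified
        fi
    elif [ -n "$hp" ]; then
        echo high-pass
    elif [ -n "$lp" ]; then
        echo low-pass
    else
        echo none
    fi
}

# Corresponding invocations (require sox):
#   sox in.wav out.wav sinc 300-3000    # band-pass
#   sox in.wav out.wav sinc 3000-300    # band-reject
#   sox in.wav out.wav sinc 300         # high-pass only
#   sox in.wav out.wav sinc -3000       # low-pass only
```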
2326
2327       spectrogram [options]
2328              Create a spectrogram of the audio; the audio is  passed  unmodi‐
2329              fied  through the SoX processing chain.  This effect is optional
2330              - type sox --help and check the list of supported effects to see
2331              if it has been included.
2332
2333              The  spectrogram is rendered in a Portable Network Graphic (PNG)
2334              file, and shows time in the X-axis, frequency in the Y-axis, and
2335              audio  signal magnitude in the Z-axis.  Z-axis values are repre‐
2336              sented by the colour (or optionally the intensity) of the pixels
2337              in  the  X-Y plane.  If the audio signal contains multiple chan‐
2338              nels then these are shown from top to bottom starting from chan‐
2339              nel 1 (which is the left channel for stereo audio).
2340
2341              For example, if `my.wav' is a stereo file, then with
2342                 sox my.wav -n spectrogram
2343              a  spectrogram  of  the  entire file will be created in the file
2344              `spectrogram.png'.  More often though,  analysis  of  a  smaller
2345              portion of the audio is required; e.g. with
2346                 sox my.wav -n remix 2 trim 20 30 spectrogram
2347              the  spectrogram  shows information only from the second (right)
2348              channel, and of thirty seconds of  audio  starting  from  twenty
2349              seconds in.  To analyse a small portion of the frequency domain,
2350              the rate effect may be used, e.g.
2351                 sox my.wav -n rate 6k spectrogram
2352              allows detailed analysis of frequencies up  to  3kHz  (half  the
2353              sampling rate) i.e. where the human auditory system is most sen‐
2354              sitive.  With
2355                 sox my.wav -n trim 0 10 spectrogram -x 600 -y 200 -z 100
2356              the given options control the size of the spectrogram's X, Y & Z
2357              axes  (in  this case, the spectrogram area of the produced image
2358              will be 600 by 200 pixels in size and the Z-axis range  will  be
2359              100  dB).   Note  that  the produced image includes axes legends
2360              etc. and so will be a little larger than the specified  spectro‐
2361              gram size.  In this example:
2362                 sox -n -n synth 6 tri 10k:14k spectrogram -z 100 -w kaiser
2363              an analysis `window' with high dynamic range is selected to best
              display the spectrogram of a swept triangular wave.  For a
              similar example, append the following to the `chime' command in the
2366              description of the delay effect (above):
2367                 rate 2k spectrogram -X 200 -Z -10 -w kaiser
              Options are also available to control  the  appearance  (colour-
2369              set,  brightness,  contrast,  etc.) and filename of the spectro‐
2370              gram; e.g. with
2371                 sox my.wav -n spectrogram -m -l -o print.png
2372              a spectrogram is created suitable for printing on a  `black  and
2373              white' printer.
2374
2375              Options:
2376
2377              -x num Change  the  (maximum)  width (X-axis) of the spectrogram
2378                     from its default value of 800 pixels to  a  given  number
2379                     between 100 and 5000.  See also -X and -d.
2380
2381              -X num X-axis  pixels/second;  the default is auto-calculated to
2382                     fit the given or known audio duration to the X-axis size,
2383                     or  100 otherwise.  If given in conjunction with -d, this
2384                     option affects the width of the  spectrogram;  otherwise,
2385                     it  affects  the duration of the spectrogram.  num can be
2386                     from 1 (low time resolution) to 5000 (high  time  resolu‐
2387                     tion)  and need not be an integer.  SoX may make a slight
2388                     adjustment to the given number for  processing  quantisa‐
2389                     tion  reasons;  if  so, SoX will report the actual number
2390                     used (viewable when  the  SoX  global  option  -V  is  in
2391                     effect).  See also -x and -d.
2392
2393              -y num Sets the Y-axis size in pixels (per channel); this is the
2394                     number of frequency `bins' used in the  Fourier  analysis
2395                     that  produces  the  spectrogram.  N.B. it can be slow to
2396                     produce the spectrogram if this number is  not  one  more
2397                     than  a  power  of two (e.g. 129).  By default the Y-axis
2398                     size is chosen automatically (depending on the number  of
                     channels).  See -Y for an alternative way of setting
                     spectrogram height.
2401
2402              -Y num Sets the target total height of the spectrogram(s).   The
2403                     default  value  is 550 pixels.  Using this option (and by
2404                     default), SoX will choose a height for  individual  spec‐
2405                     trogram channels that is one more than a power of two, so
2406                     the actual total height may fall short of the given  num‐
2407                     ber.  However, there is also a minimum height per channel
2408                     so  if  there  are  many  channels,  the  number  may  be
                     exceeded.  See -y for an alternative way of setting
                     spectrogram height.
2411
2412              -z num Z-axis (colour) range in dB, default 120.  This sets  the
2413                     dynamic-range  of  the  spectrogram  to  be  -num dBFS to
2414                     0 dBFS.  Num  may  range  from  20  to  180.   Decreasing
2415                     dynamic-range effectively increases the `contrast' of the
2416                     spectrogram display, and vice versa.
2417
2418              -Z num Sets the upper limit of the Z-axis in dBFS.   A  negative
2419                     num  effectively  increases the `brightness' of the spec‐
2420                     trogram display, and vice versa.
2421
2422              -q num Sets the Z-axis quantisation, i.e. the number of  differ‐
2423                     ent  colours  (or  intensities) in which to render Z-axis
2424                     values.   A  small  number   (e.g.   4)   will   give   a
2425                     `poster'-like  effect  making it easier to discern magni‐
2426                     tude bands of similar level.  Small numbers also  usually
2427                     result  in  small  PNG files.  The number given specifies
2428                     the number of colours to use inside the Z-axis range; two
2429                     colours are reserved to represent out-of-range values.
2430
2431              -w name
2432                     Window: Hann (default), Hamming, Bartlett, Rectangular or
2433                     Kaiser.  The spectrogram is produced using  the  Discrete
2434                     Fourier Transform (DFT) algorithm.  A significant parame‐
2435                     ter to this algorithm is the choice of `window function'.
2436                     By  default, SoX uses the Hann window which has good all-
2437                     round frequency-resolution and dynamic-range  properties.
2438                     For  better  frequency  resolution  (but  lower  dynamic-
2439                     range), select a Hamming window; for higher dynamic-range
2440                     (but  poorer  frequency-resolution), select a Kaiser win‐
2441                     dow.  Bartlett and Rectangular windows  are  also  avail‐
2442                     able.
2443
2444              -W num Window  adjustment  parameter.   This can be used to make
2445                     small adjustments to the Kaiser window shape.  A positive
2446                     number  (up  to ten) increases its dynamic range, a nega‐
2447                     tive number decreases it.
2448
2449              -s     Allow slack overlapping of DFT  windows.   This  can,  in
2450                     some  cases,  increase  image  sharpness and give greater
2451                     adherence to the -x value, but at the expense of a little
2452                     spectral loss.
2453
2454              -m     Creates a monochrome spectrogram (the default is colour).
2455
2456              -h     Selects  a  high-colour  palette - less visually pleasing
2457                     than the default colour palette, but it may make it  eas‐
2458                     ier to differentiate different levels.  If this option is
2459                     used in conjunction with -m, the result will be a  hybrid
2460                     monochrome/colour palette.
2461
2462              -p num Permute  the  colours in a colour or hybrid palette.  The
2463                     num parameter, from 1 (the default)  to  6,  selects  the
2464                     permutation.
2465
2466              -l     Creates  a  `printer  friendly'  spectrogram with a light
2467                     background (the default has a dark background).
2468
2469              -a     Suppress the display of the axis lines.   This  is  some‐
2470                     times useful in helping to discern artefacts at the spec‐
2471                     trogram edges.
2472
2473              -r     Raw spectrogram: suppress the display of  axes  and  leg‐
2474                     ends.
2475
2476              -A     Selects  an  alternative, fixed colour-set.  This is pro‐
2477                     vided only for compatibility with  spectrograms  produced
2478                     by another package.  It should not normally be used as it
2479                     has some problems, not least, a lack  of  differentiation
2480                     at  the  bottom end which results in masking of low-level
2481                     artefacts.
2482
2483              -t text
2484                     Set the image title - text to display above the  spectro‐
2485                     gram.
2486
2487              -c text
2488                     Set  (or clear) the image comment - text to display below
2489                     and to the left of the spectrogram.
2490
2491              -o text
2492                     Name of the spectrogram output PNG file,  default  `spec‐
2493                     trogram.png'.
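
              To illustrate the -Y description above, the following sketch
              picks, for each channel, the largest height of the form 2^k+1
              that fits within an equal share of the target total height.
              This is an assumption about how such a height could be chosen;
              SoX's actual selection may differ, and the minimum height per
              channel mentioned above is not modelled:

```shell
#!/bin/sh
# per_channel_height TOTAL CHANNELS
# Sketch: largest 2^k+1 not exceeding TOTAL/CHANNELS (integer division).
# Assumed selection rule; the per-channel minimum is not modelled.
per_channel_height() {
    avail=$(( $1 / $2 ))
    h=2                                   # 2^0 + 1
    while [ $(( (h - 1) * 2 + 1 )) -le "$avail" ]; do
        h=$(( (h - 1) * 2 + 1 ))          # step from 2^k+1 to 2^(k+1)+1
    done
    echo "$h"
}

per_channel_height 550 1    # one channel, default target height
per_channel_height 550 2    # two channels, default target height
```

              Under this assumption the two calls print 513 and 257, so a
              stereo spectrogram would total 514 pixels, short of the 550
              pixel target.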
2494
2495              Advanced Options:
2496              In order to process a smaller section of audio without affecting
2497              other effects or the output signal (unlike when the trim  effect
2498              is used), the following options may be used.
2499
2500              -d duration
2501                     This  option  sets  the X-axis resolution such that audio
2502                     with the given duration ([[HH:]MM:]SS) fits the  selected
2503                     (or default) X-axis width.  For example,
2504                        sox input.mp3 output.wav -n spectrogram -d 1:00 stats
                     creates a spectrogram showing the first minute of the
                     audio, whilst the stats effect is applied to the entire
                     audio signal.
2508
2509                     See also -X for an alternative way of setting the  X-axis
2510                     resolution.
2511
2512              -S time
2513                     Start  the  spectrogram  at  the given point in the audio
2514                     stream.  For example
2515                        sox input.aiff output.wav spectrogram -S 1:00
2516                     creates a spectrogram showing all but the first minute of
2517                     the  audio  (the output file however, receives the entire
2518                     audio stream).
2519
2520              For the ability to perform off-line processing of spectral data,
2521              see the stat effect.
2522
2523       speed factor[c]
2524              Adjust  the  audio  speed (pitch and tempo together).  factor is
2525              either the ratio of the new speed to the old speed: greater than
2526              1  speeds  up,  less than 1 slows down, or, if appended with the
2527              letter `c', the number of cents (i.e. 100ths of a  semitone)  by
2528              which  the  pitch (and tempo) should be adjusted: greater than 0
2529              increases, less than 0 decreases.
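
              The two forms of factor are related by the usual definition of
              the cent (1200 cents to the octave, i.e. a ratio of 2).  A
              minimal sketch of the conversion, using a hypothetical helper
              built on awk:

```shell
#!/bin/sh
# cents_to_ratio CENTS
# Sketch: speed ratio equivalent to a pitch change of CENTS cents,
# using ratio = 2^(CENTS/1200). Hypothetical helper for illustration.
cents_to_ratio() {
    awk -v c="$1" 'BEGIN { printf "%.6f\n", exp(c * log(2) / 1200) }'
}

# Usage: the following two invocations should request about the same
# change (requires sox):
#   sox in.wav out.wav speed 100c
#   sox in.wav out.wav speed "$(cents_to_ratio 100)"
```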
2530
2531              By default, the speed change is performed by resampling with the
2532              rate effect using its default quality/speed.  For higher quality
2533              or higher speed resampling, in addition  to  the  speed  effect,
2534              specify the rate effect with the desired quality option.
2535
2536              See also the pitch and tempo effects.
2537
2538       splice  [-h|-t|-q] { position[,excess[,leeway]] }
2539              Splice together audio sections.  This effect provides two things
2540              over simple audio concatenation: a (usually short) cross-fade is
2541              applied at the join, and a wave similarity comparison is made to
2542              help determine the best place at which to make the join.
2543
              One of the options -t, -h, or -q may be given to select the fade
              envelope as triangular (a.k.a. linear) (the default), half-
              cosine wave, or quarter-cosine wave respectively.
2547
2548                     Type   Audio          Fade level       Transitions
2549                      t     correlated     constant gain    abrupt
2550                      h     correlated     constant gain    smooth
2551                      q     uncorrelated   constant power   smooth
2552
2553              To perform a splice, first use the trim  effect  to  select  the
2554              audio sections to be joined together.  As when performing a tape
2555              splice, the end of the section to  be  spliced  onto  should  be
2556              trimmed  with  a  small  excess (default 0.005 seconds) of audio
2557              after the ideal joining point.  The beginning of the audio  sec‐
2558              tion to splice on should be trimmed with the same excess (before
2559              the ideal joining point), plus  an  additional  leeway  (default
2560              0.005  seconds).   SoX should then be invoked with the two audio
2561              sections as input files and the splice  effect  given  with  the
              position at which to perform the splice - this is the length of
              the first audio section (including the excess).
2564
2565              For example, a long song begins with two verses which start  (as
2566              determined  e.g. by using the play command with the trim (start)
2567              effect) at times 0:30.125 and 1:03.432.  The following  commands
2568              cut out the first verse:
2569                 sox too-long.wav part1.wav trim 0 30.130
2570              (5 ms excess, after the first verse starts)
2571                 sox too-long.wav part2.wav trim 1:03.422
2572              (5 ms excess plus 5 ms leeway, before the second verse starts)
2573                 sox part1.wav part2.wav just-right.wav splice 30.130
2574              For another example, the SoX command
2575                 play "|sox -n -p synth 1 sin %1" "|sox -n -p synth 1 sin %3"
2576              generates and plays two notes, but there is a nasty click at the
2577              transition; the click can be removed by splicing instead of con‐
2578              catenating the audio, i.e. by appending splice 1 to the command.
2579              (Clicks at the beginning and end of the audio can be removed  by
2580              preceding the splice effect with fade q .01 2 .01).
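
              The trim points used in the verse example follow directly from
              the excess and leeway.  A minimal sketch of that arithmetic,
              with times in seconds and a hypothetical helper name:

```shell
#!/bin/sh
# splice_points CUT_START CUT_END [EXCESS [LEEWAY]]
# Sketch: given the times (in seconds) where the unwanted section starts
# and ends, print the end of part 1 and the start of part 2.
# Hypothetical helper; defaults follow the values quoted in the text.
splice_points() {
    awk -v t1="$1" -v t2="$2" -v e="${3:-0.005}" -v l="${4:-0.005}" \
        'BEGIN { printf "%.3f %.3f\n", t1 + e, t2 - e - l }'
}

# Verses start at 0:30.125 and 1:03.432 (= 63.432 seconds):
splice_points 30.125 63.432
```

              This prints 30.130 and 63.422, matching the trim arguments
              30.130 and 1:03.422 used above; the first value is also the
              position given to splice.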
2581
2582              Provided your arithmetic is good enough, multiple splices can be
2583              performed with a single splice invocation.  For example:
              #!/bin/sh
              # Audio Copy and Paste Over
              # acpo infile copy-start copy-stop paste-over-start outfile
              # All times measured in samples.
              rate=`soxi -r "$1"`
              e=`expr $rate '*' 5 / 1000`  # Using default excess
              l=$e                         # and leeway.
              sox "$1" piece.wav trim `expr $2 - $e - $l`s \
                 `expr $3 - $2 + $e + $l + $e`s
              sox "$1" part1.wav trim 0 `expr $4 + $e`s
              sox "$1" part2.wav trim `expr $4 + $3 - $2 - $e - $l`s
              sox part1.wav piece.wav part2.wav "$5" splice \
                 `expr $4 + $e`s \
                 `expr $4 + $e + $3 - $2 + $e + $l + $e`s
2598              In the above Bourne shell script, two splices are used to  `copy
2599              and paste' audio.
2600
2601                                    *        *        *
2602
2603              It is also possible to use this effect to perform general cross-
              fades, e.g. to join two songs.  In this case, excess would
              typically be a number of seconds, the -q option would typically be
2606              given (to select an `equal power' cross-fade), and leeway should
2607              be  zero (which is the default if -q is given).  For example, if
2608              f1.wav and f2.wav are audio files to be cross-faded, then
2609                 sox f1.wav f2.wav out.wav splice -q $(soxi -D f1.wav),3
2610              cross-fades the files where the point of  equal  loudness  is  3
2611              seconds  before  the end of f1.wav, i.e. the total length of the
2612              cross-fade is 2 × 3 = 6 seconds (Note: the  $(...)  notation  is
2613              POSIX shell).
2614
2615       stat [-s scale] [-rms] [-freq] [-v] [-d]
2616              Display  time and frequency domain statistical information about
2617              the audio.  Audio is passed unmodified through the SoX  process‐
2618              ing chain.
2619
2620              The  information  is  output  to  the  `standard error' (stderr)
2621              stream and is calculated, where n is the duration of  the  audio
2622              in  samples,  c  is the number of audio channels, r is the audio
2623              sample rate, and xk represents the PCM value (in the range -1 to
2624              +1  by  default) of each successive sample in the audio, as fol‐
2625              lows:
2626
2627               Samples read        n×c
2628               Length (seconds)    n÷r
2629               Scaled by                                 See -s below.
2630               Maximum amplitude   max(xk)               The maximum  sample
2631                                                         value in the audio;
2632                                                         usually  this  will
2633                                                         be  a positive num‐
2634                                                         ber.
2635               Minimum amplitude   min(xk)               The minimum  sample
2636                                                         value in the audio;
2637                                                         usually  this  will
2638                                                         be  a negative num‐
2639                                                         ber.
2640               Midline amplitude   ½min(xk)+½max(xk)
2641               Mean norm           ¹/nΣ│xk│              The average of  the
2642                                                         absolute  value  of
2643                                                         each sample in  the
2644                                                         audio.
2645               Mean amplitude      ¹/nΣxk                The average of each
2646                                                         sample    in    the
2647                                                         audio.    If   this
2648                                                         figure is non-zero,
2649                                                         then  it  indicates
2650                                                         the presence  of  a
2651                                                         D.C.  offset (which
2652                                                         could  be   removed
2653                                                         using  the  dcshift
2654                                                         effect).
2655               RMS amplitude       √(¹/nΣxk²)            The level of a D.C.
2656                                                         signal  that  would
2657                                                         have the same power
2658                                                         as    the   audio's
2659                                                         average power.
2660               Maximum delta       max(│xk-xk-1│)
2661               Minimum delta       min(│xk-xk-1│)
               Mean delta          ¹/n-1Σ│xk-xk-1│
2663               RMS delta           √(¹/n-1Σ(xk-xk-1)²)
2664
2665               Rough frequency                           In Hz.
2666               Volume Adjustment                         The  parameter   to
2667                                                         the    vol   effect
2668                                                         which  would   make
2669                                                         the  audio  as loud
2670                                                         as possible without
2671                                                         clipping.     Note:
2672                                                         See the  discussion
2673                                                         on  Clipping  above
2674                                                         for reasons why  it
2675                                                         is  rarely  a  good
2676                                                         idea actually to do
2677                                                         this.
2678
2679              Note  that  the delta measurements are not applicable for multi-
2680              channel audio.
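
              As a worked illustration of the formulas in the table, the
              following sketch computes several of them for single-channel
              sample values given as text.  It is an independent awk
              rendering of the definitions, not the stat effect's own code,
              and its output layout is an assumption:

```shell
#!/bin/sh
# stat_sketch: read single-channel sample values (range -1 to +1,
# separated by whitespace) on stdin and print some of the statistics
# defined in the table. Illustrative only.
stat_sketch() {
    awk '
    {
        for (i = 1; i <= NF; i++) {
            x = $i + 0
            if (n == 0 || x > max) max = x
            if (n == 0 || x < min) min = x
            sum  += x                    # for mean amplitude
            norm += (x < 0 ? -x : x)     # for mean norm
            sq   += x * x                # for RMS amplitude
            if (n > 0) {                 # deltas need a previous sample
                d = x - prev
                if (d < 0) d = -d
                dsum += d
                if (d > dmax) dmax = d
            }
            prev = x
            n++
        }
    }
    END {
        if (n == 0) exit 1
        printf "Maximum amplitude %.6f\n", max
        printf "Minimum amplitude %.6f\n", min
        printf "Midline amplitude %.6f\n", min / 2 + max / 2
        printf "Mean norm %.6f\n", norm / n
        printf "Mean amplitude %.6f\n", sum / n
        printf "RMS amplitude %.6f\n", sqrt(sq / n)
        if (n > 1) {
            printf "Maximum delta %.6f\n", dmax
            printf "Mean delta %.6f\n", dsum / (n - 1)
        }
    }'
}

# Usage: a square wave alternating between +0.5 and -0.5
#   printf "0.5 -0.5 0.5 -0.5\n" | stat_sketch
```

              For that square wave the mean amplitude is zero (no D.C.
              offset), while the mean norm and RMS amplitude are both 0.5.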
2681
2682              The -s option can be used to scale the input  data  by  a  given
2683              factor.  The default value of scale is 2147483647 (i.e. the max‐
2684              imum value of a 32-bit signed integer).  Internal effects always
2685              work with signed long PCM data and so the value should relate to
2686              this fact.
2687
2688              The -rms option will convert all output average values to  `root
2689              mean square' format.
2690
2691              The -v option displays only the `Volume Adjustment' value.
2692
2693              The  -freq  option  calculates  the input's power spectrum (4096
2694              point DFT) instead of the statistics listed above.  This  should
2695              only be used with a single channel audio file.
2696
2697              The  -d option displays a hex dump of the 32-bit signed PCM data
2698              audio in SoX's internal buffer.  This is  mainly  used  to  help
2699              track  down  endian problems that sometimes occur in cross-plat‐
2700              form versions of SoX.
2701
2702              See also the stats effect.
2703
2704       stats [-b bits|-x bits|-s scale] [-w window-time]
2705              Display time domain  statistical  information  about  the  audio
2706              channels;  audio is passed unmodified through the SoX processing
2707              chain.  Statistics are calculated and displayed for  each  audio
2708              channel and, where applicable, an overall figure is also given.
2709
2710              For example, for a typical well-mastered stereo music file:
2711
2712                                       Overall     Left      Right
2713                          DC offset   0.000803 -0.000391  0.000803
2714                          Min level  -0.750977 -0.750977 -0.653412
2715                          Max level   0.708801  0.708801  0.653534
2716                          Pk lev dB      -2.49     -2.49     -3.69
2717                          RMS lev dB    -19.41    -19.13    -19.71
2718                          RMS Pk dB     -13.82    -13.82    -14.38
2719                          RMS Tr dB     -85.25    -85.25    -82.66
2720                          Crest factor       -      6.79      6.32
2721                          Flat factor     0.00      0.00      0.00
2722                          Pk count           2         2         2
2723                          Bit-depth      16/16     16/16     16/16
2724                          Num samples    7.72M
2725                          Length s     174.973
2726                          Scale max   1.000000
2727                          Window s       0.050
2728
2729              DC offset,  Min level,  and  Max level are shown, by default, in
2730              the range ±1.  If the -b (bits) option is given, then these
2731              three  measurements  will be scaled to a signed integer with the
2732              given number of bits; for example, for 16 bits, the scale  would
2733              be  -32768  to +32767.  The -x option behaves the same way as -b
2734              except that the signed integer values are displayed in hexadeci‐
2735              mal.   The  -s  option  scales the three measurements by a given
2736              floating-point number.
2737
2738              Pk lev dB and RMS lev dB are standard peak and  RMS  level  mea‐
2739              sured in dBFS.  RMS Pk dB and RMS Tr dB are peak and trough val‐
2740              ues for RMS level measured over a short window (default 50ms).
2741
2742              Crest factor is the standard ratio of peak to RMS  level  (note:
2743              not in dB).
2744
2745              Flat factor  is a measure of the flatness (i.e. consecutive sam‐
2746              ples with the same value) of the signal at its peak levels (i.e.
2747              either  Min level,  or  Max level).   Pk count  is the number of
2748              occasions (not the number of samples) that the  signal  attained
2749              either Min level, or Max level.
2750
2751              The  right-hand  Bit-depth  figure is the standard definition of
2752              bit-depth i.e. bits less significant than the given  number  are
2753              fixed  at zero.  The left-hand figure is the number of most sig‐
2754              nificant bits that are fixed at zero (or one for  negative  num‐
2755              bers)  subtracted  from  the  right-hand figure (the number sub‐
2756              tracted is directly related to Pk lev dB).
2757
2758              For multi-channel audio, an overall figure for each of the above
2759              measurements  is  given  and derived from the channel figures as
2760              follows: DC offset:  maximum  magnitude;  Max level,  Pk lev dB,
2761              RMS Pk dB,  Bit-depth:  maximum;  Min level, RMS Tr dB: minimum;
2762              RMS lev dB, Flat factor, Pk count:  average;  Crest factor:  not
2763              applicable.
2764
2765              Length s  is  the duration in seconds of the audio, and Num sam‐
2766              ples  is  equal  to  the  sample-rate  multiplied   by   Length.
2767              Scale max is the scaling applied to the first three measurements;
2768              specifically, it is the maximum value that could apply to
2769              Max level.   Window s  is  the length of the window used for the
2770              peak and trough RMS measurements.
2771
2772              See also the stat effect.
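
              Some of the figures in the example table above can be
              cross-checked by hand.  The following is a minimal sketch, not
              SoX code: the function names peak_db and crest_factor are
              illustrative, awk is assumed to be available, and the peak is
              assumed to be the larger magnitude of Min level and Max level.

```shell
# Peak level in dBFS from a linear peak magnitude (0 < peak <= 1).
peak_db() {
  awk -v p="$1" 'BEGIN { printf "%.2f\n", 20 * log(p) / log(10) }'
}

# Crest factor (a plain ratio, not dB) from a linear peak magnitude
# and an RMS level given in dB.
crest_factor() {
  awk -v p="$1" -v rms_db="$2" 'BEGIN { printf "%.2f\n", p / (10 ^ (rms_db / 20)) }'
}

# Left channel of the example: |Min level| = 0.750977, RMS lev dB = -19.13
peak_db 0.750977
crest_factor 0.750977 -19.13
```

              With the Left channel figures this prints -2.49 and 6.79, which
              agrees with the Pk lev dB and Crest factor rows of the table.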
2773
2774       swap   Swap stereo channels.  See also remix for an effect that  allows
2775              arbitrary channel selection and ordering (and mixing).
2776
2777       stretch factor [window fade shift fading]
2778              Change  the  audio duration (but not its pitch).  This effect is
2779              broadly equivalent to the tempo  effect  with  (factor  inverted
2780              and) search set to zero, so in general, its results are compara‐
2781              tively poor; it is retained  as  it  can  sometimes  out-perform
2782              tempo for small factors.
2783
2784              factor is the stretch factor: >1 lengthens, <1 shortens the
2785              duration.  window is the window size in ms; the default is 20.
2786              fade is the fade type and can be `lin'.  shift is a ratio in
2787              [0, 1]; its default depends on factor: 1 to shorten, 0.8 to lengthen.
2788              fading is a ratio in [0, 0.5]; its default depends on factor and shift.
2789
2790              See also the tempo effect.
2791
2792       synth [-j KEY] [-n] [len [off [ph [p1 [p2 [p3]]]]]] {[type] [combine]
2793       [[%]freq[k][:|+|/|-[%]freq2[k]]] [off [ph [p1 [p2 [p3]]]]]}
2794              This effect can be used to generate  fixed  or  swept  frequency
2795              audio  tones  with various wave shapes, or to generate wide-band
2796              noise of various `colours'.  Multiple synth effects can be  cas‐
2797              caded  to  produce  more  complex waveforms; at each stage it is
2798              possible to choose whether the generated waveform will be  mixed
2799              with,  or  modulated  onto  the  output from the previous stage.
2800              Audio for each channel in a multi-channel audio file can be syn‐
2801              thesised independently.
2802
2803              Though this effect is used to generate audio, an input file must
2804              still be given, the characteristics of which will be used to set
2805              the  synthesised  audio  length, the number of channels, and the
2806              sampling rate; however, since the input file's audio is not nor‐
2807              mally  needed, a `null file' (with the special name -n) is often
2808              given instead (and the length specified as a parameter to  synth
2809              or by another given effect that has an associated length).
2810
2811              For  example,  the  following  produces a 3 second, 48kHz, audio
2812              file containing a sine-wave swept from 300 to 3300 Hz:
2813                 sox -n output.wav synth 3 sine 300-3300
2814              and this produces an 8 kHz version:
2815                 sox -r 8000 -n output.wav synth 3 sine 300-3300
2816              Multiple channels can be synthesised by specifying  the  set  of
2817              parameters  shown  between  braces multiple times; the following
2818              puts the swept tone in the left channel and adds  `brown'  noise
2819              in the right:
2820                 sox -n output.wav synth 3 sine 300-3300 brownnoise
2821              The  following  example  shows how two synth effects can be cas‐
2822              caded to create a more complex waveform:
2823                 play -n synth 0.5 sine 200-500 synth 0.5 sine fmod 700-100
2824              Frequencies can also be given in `scientific' note notation, or,
2825              by  prefixing a `%' character, as a number of semitones relative
2826              to `middle A' (440 Hz).  For example,  the  following  could  be
2827              used to help tune a guitar's low `E' string:
2828                 play -n synth 4 pluck %-29
2829              or with a (Bourne shell) loop, the whole guitar:
2830                 for n in E2 A2 D3 G3 B3 E4; do
2831                   play -n synth 4 pluck $n repeat 2; done
2832              See the delay effect (above) and the reference to `SoX scripting
2833              examples' (below) for more synth examples.
2834
2835              N.B.  This effect generates audio  at  maximum  volume  (0dBFS),
2836              which  means  that there is a high chance of clipping when using
2837              the audio subsequently, so in many cases, you will want to  fol‐
2838              low  this  effect with the gain effect to prevent this from hap‐
2839              pening. (See also Clipping above.)  Note that, by  default,  the
2840              synth  effect incorporates the functionality of gain -h (see the
2841              gain effect for details); synth's -n option may be given to dis‐
2842              able this behaviour.
2843
2844              A detailed description of each synth parameter follows:
2845
2846              len  is the length of audio to synthesise expressed as a time or
2847              as a number of samples; 0=inputlength, default=0.
2848
2849              The format for specifying lengths in time is hh:mm:ss.frac.  The
2850              format  for  specifying  sample  counts is the number of samples
2851              with the letter `s' appended to it.
2852
2853              type is one of sine, square, triangle, sawtooth, trapezium, exp,
2854              [white]noise,  tpdfnoise,  pinknoise,  brownnoise,  pluck;
2855              default=sine.
2856
2857              combine is one of create, mix, amod (amplitude modulation), fmod
2858              (frequency modulation); default=create.
2859
2860              freq/freq2 are the frequencies at the beginning/end of synthesis
2861              in Hz  or,  if  preceded  with  `%',  semitones  relative  to  A
2862              (440 Hz);  alternatively,  `scientific'  note notation (e.g. E2)
2863              may be used.  The default frequency is 440Hz.  By  default,  the
2864              tuning  used with the note notations is `equal temperament'; the
2865              -j KEY option selects `just intonation', where KEY is an integer
2866              number  of  semitones  relative  to  A  (so for example, -9 or 3
2867              selects the key of C), or a note in scientific notation.
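
              To see which frequency a `%' semitone value corresponds to, the
              usual equal-temperament relation f = 440 x 2^(n/12) can be
              evaluated directly.  This is a minimal sketch assuming equal
              temperament (not just intonation); the function name
              semitones_to_hz is illustrative and not part of SoX.

```shell
# Convert a number of semitones relative to A (440 Hz) into Hz,
# assuming equal temperament.
semitones_to_hz() {
  awk -v n="$1" 'BEGIN { printf "%.2f\n", 440 * 2 ^ (n / 12) }'
}

semitones_to_hz -29   # the guitar low `E' example (%-29)
semitones_to_hz 0     # A itself
```

              The first call prints 82.41, the frequency of the note E2; the
              second prints 440.00.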
2868
2869              If freq2 is given, then len must also have been  given  and  the
2870              generated tone will be swept between the given frequencies.  The
2871              two given frequencies must be separated by one of the characters
2872              `:',  `+',  `/',  or `-'.  This character is used to specify the
2873              sweep function as follows:
2874
2875              :      Linear: the tone will change by a fixed number  of  hertz
2876                     per second.
2877
2878              +      Square:  a  second-order  function  is used to change the
2879                     tone.
2880
2881              /      Exponential: the tone will change by a  fixed  number  of
2882                     semitones per second.
2883
2884              -      Exponential:  as  `/', but initial phase always zero, and
2885                     stepped (less smooth) frequency changes.
2886
2887              Not used for noise.
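
              The difference between the linear (`:') and exponential (`/')
              sweeps can be illustrated by computing the instantaneous
              frequency part-way through a sweep.  The sketch below follows
              the descriptions above (a fixed number of hertz per second, or
              a fixed number of semitones per second); it is not taken from
              the SoX source, it does not model the `+' or `-' variants, and
              the function name sweep_freq is illustrative.

```shell
# Instantaneous frequency of a sweep.
#   $1 = law ("linear" or "exp")   $2 = start frequency in Hz
#   $3 = end frequency in Hz       $4 = sweep length in seconds
#   $5 = time into the sweep in seconds
sweep_freq() {
  awk -v law="$1" -v f1="$2" -v f2="$3" -v len="$4" -v t="$5" 'BEGIN {
    x = t / len                       # fraction of the sweep completed
    if (law == "linear")
      f = f1 + (f2 - f1) * x          # constant Hz per second
    else
      f = f1 * (f2 / f1) ^ x          # constant semitones per second
    printf "%.2f\n", f
  }'
}

# Half-way through a 3 second sweep from 300 Hz to 3300 Hz:
sweep_freq linear 300 3300 3 1.5
sweep_freq exp    300 3300 3 1.5
```

              Half-way through, the linear sweep has reached 1800.00 Hz
              whereas the exponential sweep is only at 994.99 Hz, since it
              spends equal time in each musical interval.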
2888
2889              off is the bias (DC-offset) of the signal in percent; default=0.
2890
2891              ph is the phase shift in percentage of 1 cycle; default=0.   Not
2892              used for noise.
2893
2894              p1  is  the  percentage  of each cycle that is `on' (square), or
2895              `rising' (triangle, exp, trapezium); default=50 (square,  trian‐
2896              gle,   exp),   default=10   (trapezium),   or  sustain  (pluck);
2897              default=40.
2898
2899              p2 (trapezium): the  percentage  through  each  cycle  at  which
2900              `falling' begins; default=50. exp: the amplitude in multiples of
2901              2dB; default=50, or tone-1 (pluck); default=20.
2902
2903              p3 (trapezium): the  percentage  through  each  cycle  at  which
2904              `falling' ends; default=60, or tone-2 (pluck); default=90.
2905
2906       tempo [-q] [-m|-s|-l] factor [segment [search [overlap]]]
2907              Change  the  audio playback speed but not its pitch. This effect
2908              uses the WSOLA algorithm. The audio is chopped up into  segments
2909              which are then shifted in the time domain and overlapped (cross-
2910              faded) at points where  their  waveforms  are  most  similar  as
2911              determined by measurement of `least squares'.
2912
2913              By  default,  linear searches are used to find the best overlap‐
2914              ping points.  If  the  optional  -q  parameter  is  given,  tree
2915              searches  are  used  instead.  This  makes  the effect work more
2916              quickly, but the result may not sound as good. However,  if  you
2917              must  improve  the  processing speed, this generally reduces the
2918              sound quality less than reducing the search or overlap values.
2919
2920              The -m option is used to optimize  default  values  of  segment,
2921              search and overlap for music processing.
2922
2923              The  -s  option  is  used to optimize default values of segment,
2924              search and overlap for speech processing.
2925
2926              The -l option is used to optimize  default  values  of  segment,
2927              search  and  overlap for `linear' processing that tends to cause
2928              more noticeable distortion but may  be  useful  when  factor  is
2929              close to 1.
2930
2931              If -m, -s, or -l is specified, the default value of segment will
2932              be calculated based on factor, while default search and  overlap
2933              values  are based on segment. Any values you provide still over‐
2934              ride these default values.
2935
2936              factor gives the ratio of new tempo to the old  tempo,  so  e.g.
2937              1.1 speeds up the tempo by 10%, and 0.9 slows it down by 10%.
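
              Because factor is a ratio of tempos, the resulting duration is
              approximately the original duration divided by factor.  A
              minimal sketch (the function name tempo_duration is
              illustrative; any rounding to whole samples is ignored):

```shell
# Approximate output duration in seconds after `tempo factor'.
#   $1 = input duration in seconds   $2 = tempo factor
tempo_duration() {
  awk -v d="$1" -v f="$2" 'BEGIN { printf "%.2f\n", d / f }'
}

tempo_duration 60 1.1   # 10% faster
tempo_duration 60 0.9   # 10% slower
```

              For a 60 second input this prints 54.55 and 66.67 respectively.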
2938
2939              The  optional  segment parameter selects the algorithm's segment
2940              size in milliseconds.  If no  other  flags  are  specified,  the
2941              default  value  is  82  and  is typically suited to making small
2942              changes to the tempo of music. For larger changes (e.g. a factor
2943              of 2), 41 ms may give a better result.  The -m, -s, and -l flags
2944              will cause the segment  default  to  be  automatically  adjusted
2945              based on factor.  For example using -s (for speech) with a tempo
2946              of 1.25 will calculate a default segment value of 32.
2947
2948              The optional search parameter gives the  audio  length  in  mil‐
2949              liseconds  over  which the algorithm will search for overlapping
2950              points.  If no other flags are specified, the default  value  is
2951              14.68.   Larger  values  use more processing time and may or may
2952              not produce better results.  A practical  maximum  is  half  the
2953              value  of  segment. Search can be reduced to cut processing time
2954              at the risk of degrading output quality.  The  -m,  -s,  and  -l
2955              flags will cause the search default to be automatically adjusted
2956              based on segment.
2957
2958              The optional overlap parameter gives the segment overlap  length
2959              in  milliseconds.   Default value is 12, but -m, -s, or -l flags
2960              automatically adjust overlap based on segment  size.  Increasing
2961              overlap  increases  processing  time and may increase quality. A
2962              practical maximum for overlap is the value of search, with
2963              overlap typically being (at least) a little smaller than search.
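
              The practical limits described above (search no more than half
              of segment, overlap no more than search) can be checked before
              running a long job.  This is only a sketch of those two rules
              of thumb; the function name tempo_params_ok is illustrative,
              and SoX itself may accept values outside these limits.

```shell
# Succeeds (exit status 0) if the values respect the practical maxima.
#   $1 = segment, $2 = search, $3 = overlap (all in milliseconds)
tempo_params_ok() {
  awk -v seg="$1" -v search="$2" -v ov="$3" 'BEGIN {
    exit !(search <= seg / 2 && ov <= search)
  }'
}

# The documented defaults pass the check:
if tempo_params_ok 82 14.68 12; then
  echo "within practical limits"
fi
```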
2964
2965              See  also  speed  for  an  effect  that  changes tempo and pitch
2966              together, pitch for an effect that changes pitch but not tempo,
2967              and stretch for an effect that changes tempo using a different
2968              algorithm.
2969
2970       treble gain [frequency[k] [width[s|h|k|o|q]]]
2971              Apply a treble tone-control effect.  See the description of  the
2972              bass effect for details.
2973
2974       tremolo speed [depth]
2975              Apply  a  tremolo (low frequency amplitude modulation) effect to
2976              the audio.  The tremolo frequency in Hz is given by  speed,  and
2977              the depth as a percentage by depth (default 40).
2978
2979       trim start [length|=end]
2980              Trim  can  trim off unwanted audio from the beginning and end of
2981              the audio.  Audio is not sent to the  output  stream  until  the
2982              start location is reached.
2983
2984              The  optional length parameter gives the length of audio to out‐
2985              put after the start sample and is thus used to trim off the  end
2986              of  the  audio.   Alternatively, an absolute end location can be
2987              given by preceding it with an equals sign.  Using a value  of  0
2988              for the start parameter will allow trimming off the end only.
2989
2990              Both  parameters can be specified using either an amount of time
2991              or an exact count of samples.  The format for specifying lengths
2992              in  time  is  hh:mm:ss.frac.   A  start value of 1:30.5 will not
2993              start until 1 minute and 30.5 seconds into the audio.  The
2994              format  for  specifying  sample  counts is the number of samples
2995              with the letter `s' appended to it.  A value of  8000s  for  the
2996              start  parameter  will  wait  until 8000 samples are read before
2997              starting to process audio.
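
              The two ways of giving a position are related through the
              sample rate.  The following sketch converts a time in
              hh:mm:ss.frac form into the equivalent sample count; the
              function name time_to_samples is illustrative, and the sample
              rate must be supplied by the caller since it depends on the
              audio being trimmed.

```shell
# Convert [[hh:]mm:]ss[.frac] to a sample count at a given sample rate.
#   $1 = time   $2 = sample rate in Hz
time_to_samples() {
  awk -v t="$1" -v rate="$2" 'BEGIN {
    n = split(t, part, ":")
    secs = 0
    for (i = 1; i <= n; i++)          # fold hours and minutes into seconds
      secs = secs * 60 + part[i]
    printf "%d\n", int(secs * rate + 0.5)
  }'
}

time_to_samples 1:30.5 8000
```

              At 8 kHz, a start of 1:30.5 is therefore the same position as
              724000s.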
2998
2999       vad [options]
3000              Voice Activity Detector.  Attempts to  trim  silence  and  quiet
3001              background  sounds from the ends of (fairly high resolution i.e.
3002              16-bit, 44-48kHz) recordings of speech.  The algorithm currently
3003              uses a simple cepstral power measurement to detect voice, so may
3004              be fooled by other things, especially  music.   The  effect  can
3005              trim  only from the front of the audio, so in order to trim from
3006              the back, the reverse effect must also be used.  E.g.
3007                 play speech.wav norm vad
3008              to trim from the front,
3009                 play speech.wav norm reverse vad reverse
3010              to trim from the back, and
3011                 play speech.wav norm vad reverse vad reverse
3012              to trim from both ends.  The use of the norm  effect  is  recom‐
3013              mended,  but  remember that neither reverse nor norm is suitable
3014              for use with streamed audio.
3015
3016              Options:
3017              Default values are shown in parentheses.
3018
3019              -t num (7)
3020                     The measurement level used to trigger activity detection.
3021                     This  might  need  to  be  changed depending on the noise
3022                     level, signal level and other characteristics of the input
3023                     audio.
3024
3025              -T num (0.25)
3026                     The  time constant (in seconds) used to help ignore short
3027                     bursts of sound.
3028
3029              -s num (1)
3030                     The amount of audio  (in  seconds)  to  search  for  qui‐
3031                     eter/shorter  bursts  of  audio  to  include prior to the
3032                     detected trigger point.
3033
3034              -g num (0.25)
3035                     Allowed gap (in seconds) between  quieter/shorter  bursts
3036                     of audio to include prior to the detected trigger point.
3037
3038              -p num (0)
3039                     The  amount  of audio (in seconds) to preserve before the
3040                     trigger point and any found quieter/shorter bursts.
3041
3042              Advanced Options:
3043              These allow fine tuning of the algorithm's internal parameters.
3044
3045              -b num The algorithm (internally) uses  adaptive  noise  estima‐
3046                     tion/reduction in order to detect the start of the wanted
3047                     audio.  This option sets the time for the  initial  noise
3048                     estimate.
3049
3050              -N num Time  constant  used  by the adaptive noise estimator for
3051                     when the noise level is increasing.
3052
3053              -n num Time constant used by the adaptive  noise  estimator  for
3054                     when the noise level is decreasing.
3055
3056              -r num Amount  of  noise reduction to use in the detection algo‐
3057                     rithm (e.g. 0, 0.5, ...).
3058
3059              -f num Frequency of the algorithm's processing/measurements.
3060
3061              -m num Measurement duration; by default, twice  the  measurement
3062                     period; i.e.  with overlap.
3063
3064              -M num Time constant used to smooth spectral measurements.
3065
3066              -h num `Brick-wall' frequency of high-pass filter applied at the
3067                     input to the detector algorithm.
3068
3069              -l num `Brick-wall' frequency of low-pass filter applied at  the
3070                     input to the detector algorithm.
3071
3072              -H num `Brick-wall'  quefrency  of  high-pass lifter used in the
3073                     detector algorithm.
3074
3075              -L num `Brick-wall' quefrency of low-pass  lifter  used  in  the
3076                     detector algorithm.
3077
3078              See also the silence effect.
3079
3080       vol gain [type [limitergain]]
3081              Apply  an  amplification  or an attenuation to the audio signal.
3082              Unlike the -v option (which is used for balancing multiple input
3083              files as they enter the SoX effects processing chain), vol is an
3084              effect like any other so can be applied  anywhere,  and  several
3085              times if necessary, during the processing chain.
3086
3087              The amount to change the volume is given by gain which is inter‐
3088              preted, according to the given type,  as  follows:  if  type  is
3089              amplitude (or is omitted), then gain is an amplitude (i.e. volt‐
3090              age or linear) ratio, if power, then a power  (i.e.  wattage  or
3091              voltage-squared) ratio, and if dB, then a power change in dB.
3092
3093              When  type  is amplitude or power, a gain of 1 leaves the volume
3094              unchanged,  less  than  1  decreases  it,  and  greater  than  1
3095              increases  it; a negative gain inverts the audio signal in addi‐
3096              tion to adjusting its volume.
3097
3098              When type is dB, a gain of 0 leaves the volume  unchanged,  less
3099              than 0 decreases it, and greater than 0 increases it.
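
              The three gain types describe the same change in different
              units.  As a worked illustration (a sketch using the standard
              decibel definitions; the function names db_to_amplitude and
              db_to_power are illustrative and not part of SoX):

```shell
# Amplitude (voltage) ratio corresponding to a change in dB.
db_to_amplitude() {
  awk -v db="$1" 'BEGIN { printf "%.4f\n", 10 ^ (db / 20) }'
}

# Power ratio corresponding to a change in dB.
db_to_power() {
  awk -v db="$1" 'BEGIN { printf "%.4f\n", 10 ^ (db / 10) }'
}

db_to_amplitude 10
db_to_power 10
```

              This prints 3.1623 and 10.0000: a gain of 10 with type dB
              corresponds to an amplitude ratio of about 3.1623, or a power
              ratio of 10.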
3100
3101              See [4] for a detailed discussion on electrical (and hence audio
3102              signal) voltage and power ratios.
3103
3104              Beware of Clipping when increasing the volume.
3105
3106              The gain and the type parameters can be concatenated if desired,
3107              e.g.  vol 10dB.
3108
3109              An  optional  limitergain value can be specified and should be a
3110              value much less than 1 (e.g. 0.05 or 0.02) and is used  only  on
3111              peaks  to  prevent clipping.  Not specifying this parameter will
3112              cause no limiter to be used.  In verbose mode, this effect  will
3113              display the percentage of the audio that needed to be limited.
3114
3115              See  also gain for a volume-changing effect with different capa‐
3116              bilities, and compand  for  a  dynamic-range  compression/expan‐
3117              sion/limiting effect.
3118
3119   Deprecated Effects
3120       The  following  effects  have  been renamed or have their functionality
3121       included in another effect; they continue to work in  this  version  of
3122       SoX but may be removed in future.
3123
3124       filter [low]-[high] [window-len [beta]]
3125              Apply  a  sinc-windowed lowpass, highpass, or bandpass filter of
3126              given window length to the signal.  This effect has been  super‐
3127              seded  by  the  sinc  effect.  Compared with `sinc', `filter' is
3128              slower and has fewer capabilities.
3129
3130              low refers to the frequency of the lower 6dB corner of the  fil‐
3131              ter.   high  refers  to the frequency of the upper 6dB corner of
3132              the filter.
3133
3134              A low-pass filter is obtained by leaving low unspecified, or  0.
3135              A  high-pass  filter is obtained by leaving high unspecified, or
3136              0, or greater than or equal to the Nyquist frequency.
3137
3138              The window-len, if unspecified, defaults to 128.  Longer windows
3139              give a sharper cut-off, smaller windows a more gradual cut-off.
3140
3141              The  beta  parameter  determines the type of filter window used.
3142              Any value greater than 2 is the beta for a Kaiser window.   Beta
3143              ≤  2  selects  a  Blackman-Nuttall  window.  If unspecified, the
3144              default is a Kaiser window with beta 16.
3145
3146              In the case of Kaiser window (beta > 2), lower betas  produce  a
3147              somewhat  faster  transition from pass-band to stop-band, at the
3148              cost of noticeable artifacts. A beta of 16 is the default,  beta
3149              less  than 10 is not recommended. If you want a sharper cut-off,
3150              don't use low betas; use a longer sample window.  A Blackman-Nuttall
3151              window is selected by specifying any `beta' ≤ 2, and the
3152              Blackman-Nuttall window has somewhat steeper  cut-off  than  the
3153              default  Kaiser  window.  You  will probably not need to use the
3154              beta parameter at all, unless you are just curious about compar‐
3155              ing the effects of Blackman-Nuttall vs. Kaiser windows.
3156
3157              This effect supports the --plot global option.
3158
3159       key [-q] shift [segment [search [overlap]]]
3160              Change  the  audio key (i.e. pitch but not tempo).  This is just
3161              an alias for the pitch effect.
3162
3163       pan direction
3164              Mix the audio from one channel to another.  Use mixer  or  remix
3165              instead of this effect.
3166
3167              The  direction  is a value from -1 to 1.  -1 represents far left
3168              and 1 represents far right.
3169
3170       polyphase [-w nut|ham] [-width n] [-cut-off c]
3171       rabbit [-c0|-c1|-c2|-c3|-c4]
3172       resample [-qs|-q|-ql] [rolloff [beta]]
3173              Formerly sample-rate-changing effects in their own right,  these
3174              are now just aliases for the rate effect.
3175

DIAGNOSTICS

3177       Exit  status  is  0 for no error, 1 if there is a problem with the com‐
3178       mand-line parameters, or 2 if an error occurs during file processing.
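
       In a script, the exit status can be used to distinguish the three
       cases.  A minimal sketch (the function name describe_sox_status and
       the file names in the comment are placeholders):

```shell
# Map a SoX exit status to a short description.
describe_sox_status() {
  case "$1" in
    0) echo "ok" ;;
    1) echo "command-line problem" ;;
    2) echo "processing error" ;;
    *) echo "unexpected status $1" ;;
  esac
}

# Typical use:
#   sox input.wav output.wav trim 0 10; describe_sox_status $?
describe_sox_status 2
```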
3179

BUGS

3181       Please report any bugs found in this version of SoX to the mailing list
3182       (sox-users@lists.sourceforge.net).
3183

SEE ALSO

3185       soxi(1), soxformat(7), libsox(3)
3186       audacity(1), gnuplot(1), octave(1), wget(1)
3187       The SoX web site at http://sox.sourceforge.net
3188       SoX scripting examples at http://sox.sourceforge.net/Docs/Scripts
3189
3190   References
3191       [1]    R. Bristow-Johnson, Cookbook formulae for audio EQ biquad filter
3192              coefficients, http://musicdsp.org/files/Audio-EQ-Cookbook.txt
3193
3194       [2]    Wikipedia, Q-factor, http://en.wikipedia.org/wiki/Q_factor
3195
3196       [3]    Scott Lehman, Effects Explained,
3197              http://harmony-central.com/Effects/effects-explained.html
3198
3199       [4]    Wikipedia, Decibel, http://en.wikipedia.org/wiki/Decibel
3200
3201       [5]    Richard  Furse,  Linux  Audio  Developer's  Simple  Plugin  API,
3202              http://www.ladspa.org
3203
3204       [6]    Richard Furse, Computer Music Toolkit, http://www.ladspa.org/cmt
3205
3206       [7]    Steve Harris, LADSPA plugins, http://plugin.org.uk
3207

LICENSE

3209       Copyright 1998-2011 Chris Bagwell and SoX Contributors.
3210       Copyright 1991 Lance Norskog and Sundry Contributors.
3211
3212       This program is free software; you can redistribute it and/or modify it
3213       under  the  terms of the GNU General Public License as published by the
3214       Free Software Foundation; either version 2, or  (at  your  option)  any
3215       later version.
3216
3217       This  program  is  distributed  in the hope that it will be useful, but
3218       WITHOUT ANY  WARRANTY;  without  even  the  implied  warranty  of  MER‐
3219       CHANTABILITY  or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General
3220       Public License for more details.
3221

AUTHORS

3223       Chris Bagwell (cbagwell@users.sourceforge.net).  Other authors and con‐
3224       tributors are listed in the ChangeLog file that is distributed with the
3225       source code.
3226
3227
3228
3229sox                            February 19, 2011                        SoX(1)