play(1) - f7

1SoX(1)                          Sound eXchange                          SoX(1)
2
3
4

NAME

6       SoX - Sound eXchange - The Swiss Army knife of audio manipulation
7

SYNOPSIS

9       sox [global-options] [format-options] infile1
10           [[format-options] infile2] ... [format-options] outfile
11           [effect [effect-options]] ...
12
13       play [global-options] [format-options] infile1
14           [[format-options] infile2] ... [format-options]
15           [effect [effect-options]] ...
16
17       rec [global-options] [format-options] outfile
18           [effect [effect-options]] ...
19

DESCRIPTION

21       SoX  reads  and  writes  audio  files  in  most popular formats and can
22       optionally apply  effects  to  them;  it  can  combine  multiple  input
23       sources,  synthesise audio, and, on many systems, act as a general pur‐
24       pose audio player or a multi-track audio recorder.
25
26       The entire SoX functionality is available using just the `sox' command;
27       however, to simplify playing and recording audio, if SoX is invoked  as
28       `play', the output file is automatically set to be  the  default  sound
29       device, and if invoked as `rec', the default sound device is used as an
30       input source.
31
32       The heart of SoX is a library called `Sound Tools'.   Those  interested
33       in  extending  SoX  or  using  it in other programs should refer to the
34       Sound Tools manual page: libst(3).
35
36       The overall SoX processing chain can be summarised as follows:
37
38                 Input(s) → Balancing → Combiner → Effects → Output
39
40       To show how this works in practice, here are some examples of  how  SoX
41       might be used.  The simple:
42
43            sox recital.au recital.wav
44
45       translates  an  audio  file  in  Sun AU format to a Microsoft WAV file,
46       whilst:
47
48            sox recital.au -r 12000 -b -c 1 recital.wav vol 0.7 dither
49
50       performs the same format translation, but also changes the  audio  sam‐
51       pling  rate  & sample size, down-mixes to mono, and applies the vol and
52       dither effects.
53
54            sox -r 8000 -u -b -c 1 voice-memo.raw voice-memo.wav
55
56       adds a header to a raw audio file,
57
58            sox slow.aiff fixed.aiff speed 1.027 rabbit -c0
59
60       adjusts audio speed using the most accurate rabbit algorithm,
61
62            sox short.au long.au longer.au
63
64       concatenates two audio files, and
65
66            sox -m music.mp3 voice.wav mixed.flac
67
68       mixes together two audio files.
69
70            play "The Moonbeams/Greatest/*.ogg" bass +3
71
72       plays a collection of audio  files  whilst  applying  a  bass  boosting
73       effect,
74
75            play  -c  4 -n -c 1 synth sin %-12 sin %-9 sin %-5 sin %-2 vol 0.7
76       mixer fade q 0.1 1 0.1
77
78       plays a synthesised `A minor seventh' chord with a pipe-organ sound,
79
80            rec -c 2 test.aiff trim 0 10
81
82       records 10 seconds of stereo audio, and
83
84            rec -M take1.aiff take1-dub.aiff
85
86       records a new track in a multi-track recording.
87
88       Further examples are included  throughout  this  manual;  more-detailed
89       examples can be found in the separate soxexam(7) manual.
90
91   File Formats
92       There  are  two types of audio file format that SoX can work with.  The
93       first is `self-describing'; these formats include a  header  that  com‐
94       pletely  describes  the characteristics of the audio data that follows.
95       The second type is `headerless' (or `raw data'); here, the  audio  data
96       characteristics must be described using the SoX command line.
97
98       The  following four characteristics are sufficient to describe the for‐
99       mat of audio data such that it can be processed with SoX:
100
101       sample rate
102              The sample rate in samples per second (`Hertz'  or  `Hz').   For
103              example,  digital  telephony traditionally uses a sample rate of
104              8000 Hz (8 kHz); audio Compact Discs use 44100 Hz (44.1 kHz).
105
106       sample size
107              The number of bits used to store each sample. Most  popular  are
108              8-bit  (one byte) and 16-bit (two bytes). (Since many now-common
109              sound formats were invented when most computers  used  a  16-bit
110              word, two bytes is often called a `word', but since current per‐
111              sonal computers overwhelmingly have 32-bit or 64-bit words, this
112              usage is confusing, and is not used in the SoX documentation.)
113
114       data encoding
115              The   way   in  which  each  audio  sample  is  represented  (or
116              `encoded').  Some encodings have variants with  different  byte-
117              orderings or bit-orderings; some `compress' the audio data, i.e.
118              the stored audio data takes up less space  (i.e.  disk-space  or
119              transmission  band-width)  than  the other format parameters and
120              the number of samples would imply.  Commonly-used encoding types
121              include floating-point, μ-law, ADPCM, signed linear, and FLAC.
122
123       channels
124              The  number  of  audio  channels  contained  in  the  file.  One
125              (`mono') and two (`stereo') are widely used.
126
127       The term `bit-rate' is sometimes used as an overall measure of an audio
128       format and may incorporate elements of all of the above.
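
       For uncompressed audio, the bit-rate follows directly from the sample
       rate, sample size, and number of channels described above.  The Python
       sketch below illustrates the arithmetic; the function name
       pcm_bit_rate is hypothetical and is not part of SoX.

```python
def pcm_bit_rate(sample_rate_hz, sample_size_bits, channels):
    """Return the bit-rate, in bits per second, of uncompressed audio."""
    return sample_rate_hz * sample_size_bits * channels


# 8 kHz, 8-bit, one channel (the characteristics of the voice-memo example)
voice_memo = pcm_bit_rate(8000, 8, 1)
# 44.1 kHz, 16-bit, two channels (audio Compact Disc)
compact_disc = pcm_bit_rate(44100, 16, 2)
print(voice_memo, compact_disc)
```

       Compressed encodings store fewer bits than this calculation implies,
       which is why the data encoding also affects the bit-rate.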
129
130       Most self-describing formats also allow textual `comments' to be embed‐
131       ded in the file that can be used to describe the  audio  in  some  way,
132       e.g. for music, the title, the author, etc.
133
134       One  important  use  of  audio file comments is to convey `Replay Gain'
135       information.  SoX supports applying Replay Gain  information,  but  not
136       generating it.  Note that by default, SoX copies input file comments to
137       output files that support comments, so output files may contain  Replay
138       Gain  information if some was present in the input file.  In this case,
139       if anything other than a simple format conversion  was  performed  then
140       the  output  file Replay Gain information is likely to be incorrect and
141       so should be recalculated using a tool that supports this (not SoX).
142
143   Determining & Setting The File Format
144       There are several mechanisms available for SoX to use to  determine  or
145       set the format characteristics of an audio file.  Depending on the cir‐
146       cumstances, individual characteristics may be determined or  set  using
147       different mechanisms.
148
149       To  determine  the  format  of an input file, SoX will use, in order of
150       precedence and as given or available:
151
152
153           1.   Command-line format options.
154           2.   The contents of the file header.
155           3.   The filename extension.
156
157       To set the output file format, SoX will use, in order of precedence and
158       as given or available:
159
160
161           1.   Command-line format options.
162           2.   The filename extension.
163           3.   The  input  file  format  characteristics, or the closest to
164                them that is supported by the output file type.
165
166       For all files, SoX will exit with an error if the file type  cannot  be
167       determined; command-line format options may need to be added or changed
168       to resolve the problem.
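
       The input-file precedence rules above can be pictured as taking, for
       each characteristic, the first value that is available.  The following
       Python sketch is a simplified model only; the function names and the
       dictionary layout are assumptions made for illustration and do not
       reflect how SoX is implemented.

```python
def first_available(*sources):
    """Return the first value that is not None, or None if all are missing."""
    for value in sources:
        if value is not None:
            return value
    return None


def resolve_input(cli, header, extension):
    """Resolve each characteristic independently, highest precedence first."""
    keys = set(cli) | set(header) | set(extension)
    return {
        key: first_available(cli.get(key), header.get(key), extension.get(key))
        for key in keys
    }


# A file with a header whose sample rate is overridden on the command line.
resolved = resolve_input(
    cli={"rate": 12000},
    header={"rate": 8000, "channels": 1},
    extension={"type": "wav"},
)
print(sorted(resolved.items()))
```

       For an output file the same idea applies with the sources ordered as
       command-line options, filename extension, then input file format.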
169
170   Accuracy
171       Many file formats that compress audio discard some of the audio  signal
172       information  whilst doing so; converting to such a format then convert‐
173       ing back again will not produce an exact copy of  the  original  audio.
174       This  is the case for many formats used in telephony (e.g.  A-law, GSM)
175       where low signal bandwidth is more important than high audio  fidelity,
176       and  for many formats used in portable music players (e.g. MP3, Vorbis)
177       where adequate fidelity can be retained even with the large compression
178       ratios that are needed to make portable players practical.
179
180       Formats  that  discard audio signal information are called `lossy', and
181       formats that do not, `lossless'.  The term `quality' is used as a  mea‐
182       sure  of  how  closely the original audio signal can be reproduced when
183       using a lossy format.
184
185       Audio file conversion with SoX is lossless when it can  be,  i.e.  when
186       not  using  lossy  compression,  when not reducing the sampling rate or
187       number of channels, and when the number of bits used in the destination
188       format is not less than in the source format.  E.g.  converting from an
189       8-bit PCM format to a 16-bit PCM format is lossless but converting from
190       an 8-bit PCM format to (8-bit) A-law isn't.
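
       The effect of the number of bits on losslessness can be checked with a
       small Python sketch.  It assumes signed samples and converts by simple
       scaling; the conversion SoX actually performs is not shown here, and
       the function names are hypothetical.

```python
def widen_8_to_16(sample):
    """Convert a signed 8-bit sample to 16 bits by scaling by 256."""
    return sample * 256


def narrow_16_to_8(sample):
    """Convert a signed 16-bit sample to 8 bits, discarding the low byte."""
    return sample >> 8


# Every 8-bit value survives a round trip through 16 bits.
lossless = all(narrow_16_to_8(widen_8_to_16(s)) == s for s in range(-128, 128))

# A 16-bit value generally does not survive a round trip through 8 bits.
original = 12345
restored = widen_8_to_16(narrow_16_to_8(original))
print(lossless, original, restored)
```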
191
192       N.B.   SoX  converts all audio files to an internal uncompressed format
193       before performing any audio processing; this means that manipulating  a
194       file that is stored in a lossy format can cause further losses in audio
195       fidelity.  E.g. with
196
197             sox long.mp3 short.mp3 trim 10
198
199       SoX first decompresses the  input  MP3  file,  then  applies  the  trim
200       effect,  and  finally  creates the output MP3 file by recompressing the
201       audio - with a possible reduction in fidelity above that which occurred
202       when  the input file was created.  Hence, if what is ultimately desired
203       is lossily compressed audio, it is highly recommended  to  perform  all
204       audio  processing  using  lossless file formats and then convert to the
205       lossy format at the final stage.
206
207       N.B.  Applying multiple effects with a single SoX invocation  will,  in
208       general, produce more accurate results than those produced using multi‐
209       ple SoX invocations; hence this is also recommended.
210
211   Clipping
212       Clipping is distortion that occurs when an audio signal level (or `vol‐
213       ume')  exceeds  the  range  of the chosen representation.  It is nearly
214       always undesirable and so should usually be corrected by adjusting  the
215       volume prior to the point at which clipping occurs.
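
       As an illustration of the definition above, the Python sketch below
       clamps amplified samples to the signed 16-bit range.  The choice of a
       16-bit representation and the function name are assumptions made for
       the example.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def clip_to_int16(value):
    """Clamp a sample to the signed 16-bit range."""
    return max(INT16_MIN, min(INT16_MAX, value))


# Doubling the volume of a signal whose peaks already use most of the range.
signal = [1000, 20000, -25000, 30000]
amplified = [clip_to_int16(2 * s) for s in signal]
clipped = sum(1 for s in signal if not INT16_MIN <= 2 * s <= INT16_MAX)
print(amplified, clipped)
```

       The three largest samples are flattened to the range limits, which is
       the distortion heard as clipping.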
216
217       In  SoX,  clipping could occur, as you might expect, when using the vol
218       effect to increase the audio volume, but could  also  occur  with  many
219       other  effects,  when  converting  one format to another, and even when
220       simply playing the audio.
221
222       Playing an audio file often involves  re-sampling,  and  processing  by
223       analogue  components that can introduce a small DC offset and/or ampli‐
224       fication, all of which can produce distortion if the audio signal level
225       was initially too close to the clipping point.
226
227       For these reasons, it is usual to make sure that an audio file's signal
228       level does not exceed around 70% of the maximum (linear)  range  avail‐
229       able, as this will avoid the majority of clipping problems.  SoX's stat
230       effect can assist in determining the signal level in an audio file; the
231       vol effect can be used to prevent clipping, e.g.
232
233             sox dull.au bright.au vol -6 dB treble +6
234
235       guarantees that the treble boost will not clip.
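
       The relationship between a gain in dB and a linear amplitude factor
       can be sketched in Python as follows.  The formulas are the standard
       amplitude decibel conversions; the function names are hypothetical,
       and the sketch assumes, as the example above does, that the treble
       effect boosts by at most the stated 6 dB.

```python
import math


def db_to_amplitude(db):
    """Convert a gain in decibels to a linear amplitude factor."""
    return 10 ** (db / 20)


def amplitude_to_db(factor):
    """Convert a linear amplitude factor to a gain in decibels."""
    return 20 * math.log10(factor)


cut = db_to_amplitude(-6)         # roughly half the amplitude
boost = db_to_amplitude(6)        # roughly double the amplitude
headroom = amplitude_to_db(0.7)   # the 70% guideline expressed in dB
print(round(cut, 3), round(cut * boost, 3), round(headroom, 1))
```

       A 6 dB cut followed by a boost of at most 6 dB gives a net gain of no
       more than 1, so the peak level cannot rise above its original value.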
236
237       If  clipping  occurs at any point during processing, then SoX will dis‐
238       play a warning message to that effect.
239
240   Input File Combining
241       SoX's input combiner can combine multiple files using one of four  dif‐
242       ferent  methods:  `concatenate',  `sequence',  `mix',  or `merge'.  The
243       default method is `sequence' for play, and `concatenate'  for  rec  and
244       sox.
245
246       For  all  methods other than `sequence', multiple input files must have
247       the same sampling rate; if necessary, separate SoX invocations  can  be
248       used to make sampling rate adjustments prior to combining.
249
250       If  the  `concatenate' combining method is selected (usually, this will
251       be by default) then the input files must also have the same  number  of
252       channels.   The audio from each input will be concatenated in the order
253       given to form the output file.
254
255       The `sequence' combining method is selected automatically for play.  It
256       is  similar  to `concatenate' in that the audio from each input file is
257       sent serially to the output file; however, here the output file may  be
258       closed and reopened at the corresponding transition between input files
259       - this may be just what is needed  when  sending  audio  to  an  output
260       device,  but  is  not generally useful when the output file is a normal
261       file.
262
263       If the `mix' combining method is selected (with -m) then  two  or  more
264       input files must be given and will be mixed together to form the output
265       file.  The number of channels in each input file need not be the  same;
266       however,  SoX will issue a warning if they are not and some channels in
267       the output file will not contain audio from every input file.  A  mixed
268       audio file cannot be un-mixed.
269
270       If the `merge' combining method is selected (with -M), then two or more
271       input files must be given and will be merged together to form the  out‐
272       put  file.   The  number of channels in each input file need not be the
273       same.  A merged audio file comprises all of the channels  from  all  of
274       the  input  files; un-merging is possible using multiple invocations of
275       SoX with the mixer effect.  For example, two mono files could be merged
276       to  form  one stereo file; the first and second mono files would become
277       the left and right channels of the stereo file.
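
       The merge of two mono inputs into one stereo output can be modelled
       with the Python sketch below.  Each input is represented as a list of
       frames, one tuple of channel samples per frame; this representation,
       the function name, and the assumption that the inputs have equal
       length are illustrative only.

```python
def merge(*inputs):
    """Merge inputs frame by frame, keeping every channel of every input."""
    merged = []
    for frames in zip(*inputs):
        combined = ()
        for frame in frames:
            combined += frame
        merged.append(combined)
    return merged


left = [(1,), (2,), (3,)]        # first mono file
right = [(10,), (20,), (30,)]    # second mono file
stereo = merge(left, right)
print(stereo)
```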
278
279       When combining input files, SoX applies any specified effects  (includ‐
280       ing, for example, the vol volume adjustment effect) after the audio has
281       been combined; however, it is often useful to be able to set the volume
282       of  (i.e.  `balance')  the  inputs individually, before combining takes
283       place.
284
285       For all combining methods, input file volume adjustments  can  be  made
286       manually using the -v option (below) which can be given for one or more
287       input files; if it is given for only some of the input files  then  the
288       others  receive no volume adjustment.  In some circumstances, automatic
289       volume adjustments may be applied (see below).
290
291       The -V option (below) can be used to show the input file volume adjust‐
292       ments that have been selected (either manually or automatically).
293
294       There are some special considerations that need to be made when  mixing
295       input files:
296
297       Unlike the other methods, `mix' combining has the  potential  to  cause
298       clipping  in  the  combiner  if no balancing is performed.  So here, if
299       manual volume adjustments are not given, to ensure that  clipping  does
300       not occur, SoX will automatically adjust the volume (amplitude) of each
301       input signal by a factor of ¹/n, where n is the number of input  files.
302       If this results in audio that is too quiet or otherwise unbalanced then
303       the input file volumes should be set manually as described above.
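
       The automatic balancing described above can be sketched in Python.
       The example assumes mono inputs of equal length held as lists of
       numbers; the function name and the full-scale value are assumptions
       chosen for illustration.

```python
def mix(*inputs):
    """Mix equal-length mono inputs, scaling each by 1/n before summing."""
    n = len(inputs)
    return [sum(s / n for s in samples) for samples in zip(*inputs)]


FULL_SCALE = 32767
music = [FULL_SCALE, -FULL_SCALE, 1000]
voice = [FULL_SCALE, -FULL_SCALE, 3000]
mixed = mix(music, voice)
print(mixed)
```

       Even when every input is at full scale, the mixed result stays within
       the original range; the cost is that each input is made quieter.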
304
305       If mixed audio seems loud enough at some points through the  audio  but
306       too  quiet  in others, then dynamic-range compression should be applied
307       to correct this - see the compand effect.
308
309   Stopping SoX
310       Usually SoX will complete its processing and exit automatically;
311       however, if desired, it can be terminated by pressing the keyboard
312       interrupt key (usually Ctrl-C).  This is a natural requirement in some
313       circumstances, e.g. when using SoX to make a recording.  Note that when
314       using SoX to play multiple files, Ctrl-C behaves slightly  differently:
315       pressing it once causes SoX to skip to the next file; pressing it twice
316       in quick succession causes SoX to exit.
317

FILENAMES

319       The following `special' filenames may be used in certain  circumstances
320       in place of a normal filename on the command line:
321
322       -      SoX  can  be  used  in  pipeline operations by using the special
323              filename `-' which, if used in place of an input filename,  will
324              cause  SoX  to  read  audio  data from `standard input' (stdin),
325              and which, if used in place of the output filename,  will  cause
326              SoX  to  send  audio  data to `standard output' (stdout).  Note
327              that when using this option, the file-type (see -t  below)  must
328              also be given.
329
330       -n     This  can  be  used  in  place of an input or output filename to
331              specify that a `null file' is to be used.  Note that here, `null
332              file'  refers  to a SoX-specific mechanism and is not related to
333              any operating-system mechanism with a similar name.
334
335              Using a null file to input audio is equivalent to using a normal
336              audio  file  that contains an infinite amount of silence, and as
337              such is not generally useful unless used  with  an  effect  that
338              specifies a finite time length (such as trim or synth).
339
340              Using  a  null  file  to  output audio amounts to discarding the
341              audio and is useful mainly with effects that produce information
342              about  the  audio  instead of affecting it (such as noiseprof or
343              stat).
344
345              The number of channels and the sampling rate associated  with  a
346              null  file  are  by default 2 and 44.1 kHz respectively, but, as
347              with a normal file, these can be  overridden  if  desired  using
348              command-line format options (see below).
349
350              One  other use of -n is to use it in conjunction with -V to dis‐
351              play information from the audio file header  without  having  to
352              read  any further into the file, e.g.  sox -V *.wav -n will dis‐
353              play header information for  each  `WAV'  file  in  the  current
354              directory.
355
356       -e     This is an alias of -n and is retained for backwards compatibil‐
357              ity only.
358
359       N.B.  Giving SoX an input or output filename that is the same as a  SoX
360       effect-name will not work since SoX will treat it as an effect specifi‐
361       cation.  The only work-around to this is to avoid such filenames;  how‐
362       ever, this is generally not difficult since most audio filenames have a
363       filename `extension', whilst effect-names do not.
364

OPTIONS

366   Global Options
367       These options can be specified on the command line at any point  before
368       the first effect name.
369
370       -h, --help
371              Show version number and usage information.
372
373       --help-effect=name
374              Show  usage  information  on the specified effect.  The name all
375              can be used to show usage on all effects.
376
377       --interactive
378              Prompt before overwriting an existing file with the same name as
379              that given for the output file.
380
381              N.B.   Unintentionally  overwriting  a  file  is easier than you
382              might think, for example, if you accidentally enter
383
384                    sox file1 file2 effect1 effect2 ...
385
386              when what you really meant was
387
388                    play file1 file2 effect1 effect2 ...
389
390              then, without this option, file2 will  be  overwritten.   Hence,
391              using  this  option  is  strongly  recommended; a `shell' alias,
392              script, or batch file may be an appropriate way  of  permanently
393              enabling it.
394
395       -m|-M|--combine=concatenate|merge|mix|sequence
396              Select  the  input  file  combining method; -m selects `mix' and
397              -M selects `merge'.
398
399              See Input File Combining above for a description of the  differ‐
400              ent combining methods.
401
402       --octave
403              Run  in  a  mode  that  can be used, in conjunction with the GNU
404              Octave program, to assist with the selection  and  configuration
405              of  many  of  the filtering effects.  For the first given effect
406              that supports the --octave option, SoX will output  Octave  com‐
407              mands  to  plot  the  effect's  transfer function, and then exit
408              without actually processing any audio.  E.g.
409
410                    sox --octave input-file -n highpass 1320 > plot.m
411                    octave plot.m
412
413       -q, --no-show-progress
414              Run in quiet mode when SoX wouldn't otherwise do so; this is the
415              opposite of the -S option.
416
417       --replay-gain=track
418       --replay-gain=album
419       --replay-gain=off
420              Select  whether  or not to apply replay-gain adjustment to input
421              files.  The default is track for play and off otherwise.
422
423       -S, --show-progress
424              Display input file format/header information and  input  file(s)
425              processing  progress in terms of elapsed/remaining time and per‐
426              centage complete.  This option is enabled by default when  using
427              SoX to play or record audio.
428
429       --version
430              Show version number and exit.
431
432       -V[level]
433              Set  verbosity.   SoX  prints  messages  to the console (stderr)
434              according to the following verbosity levels:
435
436              0      No messages are printed at all; use the  exit  status  to
437                     determine if an error has occurred.
438
439              1      Only  error messages are printed.  These are generated if
440                     SoX cannot complete the requested commands.
441
442              2      Warning messages are also printed.  These  are  generated
443                     if  SoX  can  complete  the  requested  commands, but not
444                     exactly according to the requested command parameters, or
445                     if clipping occurs.
446
447              3      Descriptions of SoX's processing phases are also printed.
448                     Useful for seeing exactly how SoX is mangling your audio.
449
450              4 and above
451                     Messages to help with debugging SoX are also printed.
452
453              By default, the verbosity level is set to 2.  Each occurrence of
454              the  -V  option  increases  the  verbosity level by 1.  Alterna‐
455              tively, the verbosity level can be set to an absolute number  by
456              specifying it immediately after the -V e.g.  -V0 sets it to 0.
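
              The rules for the verbosity level can be modelled with the
              Python sketch below.  It is not SoX's own option parser; the
              function name is hypothetical, and processing the options
              from left to right when both forms are mixed is an assumption.

```python
import re

DEFAULT_VERBOSITY = 2


def verbosity_level(args):
    """Compute the verbosity level from a list of command-line arguments."""
    level = DEFAULT_VERBOSITY
    for arg in args:
        match = re.fullmatch(r"-V(\d*)", arg)
        if not match:
            continue                      # not a verbosity option
        if match.group(1):
            level = int(match.group(1))   # -Vn sets an absolute level
        else:
            level += 1                    # bare -V raises the level by one
    return level


print(
    verbosity_level([]),
    verbosity_level(["-V"]),
    verbosity_level(["-V", "-V"]),
    verbosity_level(["-V0"]),
)
```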
457
458   Input File Options
459       These  options  apply  only  to  input files and may precede only input
460       filenames on the command line.
461
462       -v volume, --volume=volume
463              Adjust volume by a factor of volume.  This is a  linear  (ampli‐
464              tude)  adjustment, so a number less than 1 decreases the volume;
465              greater than 1 increases it.  If a  negative  number  is  given,
466              then in addition to the volume adjustment, the audio signal will
467              be inverted.
468
469              See also the stat effect for information on how to find the max‐
470              imum  volume  of  an audio file; this can be used to help select
471              suitable values for this option.
472
473              See also Input File Combining above.
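
              The effect of a linear volume factor, including the inversion
              caused by a negative value, can be sketched in Python.  The
              function name and the sample values are illustrative only.

```python
def apply_volume(samples, volume):
    """Scale every sample by a linear (amplitude) factor."""
    return [s * volume for s in samples]


samples = [100, -200, 300]
quieter = apply_volume(samples, 0.5)    # less than 1 decreases the volume
inverted = apply_volume(samples, -1)    # negative also inverts the signal
print(quieter, inverted)
```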
474
475   Input & Output File Format Options
476       These options apply to the input or output file whose name they immedi‐
477       ately precede on the command line and are used mainly when working with
478       headerless file formats or when specifying a format for the output file
479       that is different to that of the input file.
480
481       -c channels, --channels=channels
482              The  number of audio channels in the audio file.  This may be 1,
483              2, or 4; for mono, stereo, or quad audio.  To cause  the  output
484              file to have a different number of channels than the input file,
485              include this option with the output file options.  If the  input
486              and  output  file  have  a different number of channels then the
487              mixer effect must be used.  If the mixer effect is not specified
488              on  the  command line it will be invoked internally with default
489              parameters.
490
491       --comment text
492              Specify the comment text to store  in  the  output  file  header
493              (where applicable).
494
495              SoX  will  provide  a  default comment if this option (or --com‐
496              ment-file) is not given; to specify that no  comment  should  be
497              stored in the output file, use --comment "" or --comment=.
498
499       --comment-file filename
500              Specify  a file containing the comment text to store in the out‐
501              put file header (where applicable).
502
503       -r rate, --rate=rate
504              Gives the sample rate in Hz of the file.  To  cause  the  output
505              file  to  have  a  different  sample  rate  than the input file,
506              include this option with the output file format options.
507
508              If the input and output files have different rates then a sample
509              rate  change  effect  must  be run.  Since SoX has multiple rate
510              changing effects, the user  can  specify  which  to  use  as  an
511              effect.   If  no  rate change effect is specified then a default
512              one will be chosen.
513
514       -t file-type, --type=file-type
515              Gives the type of the audio file.  This is useful when the  file
516              extension is non-standard or when the type cannot be  determined
517              by looking at the header of the file.
518
519              The -t option can also be used to override the type  implied  by
520              an  input filename extension, but if overriding with a type that
521              has a header, SoX will exit with an appropriate error message if
522              such a header is not actually present.
523
524              See FILE TYPES below for a list of supported file types.
525
526       -L, --endian=little
527       -B, --endian=big
528       -x, --endian=swap
529              These  options  specify whether the byte-order of the audio data
530              is, respectively, `little endian', `big endian', or the opposite
531              to  that  of  the system on which SoX is being used.  Endianness
532              applies only to data encoded as signed or unsigned  integers  of
533              16  or more bits.  It is often necessary to specify one of these
534              options for headerless files, and sometimes necessary for  (oth‐
535              erwise)  self-describing  files.   A given endian-setting option
536              may be ignored for an input file whose header  contains  a  spe‐
537              cific endianness identifier, or for an output file that is actu‐
538              ally an audio device.
539
540              N.B.   Unlike  normal  format  characteristics,  the  endianness
541              (byte, nibble, & bit ordering) of the input file is not automat‐
542              ically used for the output file; so, for example, when the  fol‐
543              lowing is run on a little-endian system:
544
545                    sox -B audio.uw trimmed.uw trim 2
546
547              trimmed.uw will be created as little-endian;
548
549                    sox -B audio.uw -B trimmed.uw trim 2
550
551              must be used to preserve big-endianness in the output file.
552
553              The -V option can be used to check the selected orderings.
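
              The difference between the byte orderings can be seen with
              Python's standard struct module.  The 16-bit sample value used
              here is arbitrary and chosen only to make the two bytes easy
              to tell apart.

```python
import struct

sample = 0x1234                         # one unsigned 16-bit sample

little = struct.pack("<H", sample)      # little endian: low byte first
big = struct.pack(">H", sample)         # big endian: high byte first
swapped = little[::-1]                  # swapping the bytes of one ordering

print(little.hex(), big.hex(), swapped == big)
```

              Reading big-endian data as little-endian (or the reverse)
              yields different sample values, which is why the ordering has
              to be stated for headerless files.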
554
555       -N, --reverse-nibbles
556              Specifies that the nibble ordering (i.e. the 2 halves of a byte)
557              of the samples should be reversed; sometimes useful with  ADPCM-
558              based formats.
559
560              N.B.  See also N.B. in section on -x above.
561
562       -X, --reverse-bits
563              Specifies  that  the  bit  ordering  of  the  samples  should be
564              reversed; sometimes useful with a few (mostly  headerless)  for‐
565              mats.
566
567              N.B.  See also N.B. in section on -x above.
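
              Nibble and bit reversal, as selected by -N and -X, can be
              illustrated on a single byte with the Python sketch below.
              The function names are hypothetical, and the sketch models
              only the reordering itself, not how SoX applies it to a file.

```python
def reverse_nibbles(byte):
    """Swap the two 4-bit halves of a byte."""
    return ((byte & 0x0F) << 4) | (byte >> 4)


def reverse_bits(byte):
    """Reverse the order of the 8 bits in a byte (MSB becomes LSB)."""
    return int(f"{byte:08b}"[::-1], 2)


print(hex(reverse_nibbles(0x1F)), hex(reverse_bits(0x01)))
```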
568
569       -s/-u/-U/-A/-a/-i/-g/-f
570              The  audio  data  encoding  is  signed  linear (2's complement),
571              unsigned  linear,  μ-law  (logarithmic),  A-law   (logarithmic),
572              ADPCM, IMA-ADPCM, GSM, or floating-point.
573
574              μ-law (or mu-law) and A-law are the U.S. and international
575              standards for logarithmic telephone audio compression.  When
576              uncompressed, μ-law has roughly the precision of 14-bit PCM audio
577              and A-law has roughly the precision of 13-bit PCM audio.
578
579              A-law and μ-law are sometimes encoded using reversed  bit-order‐
580              ing  (i.e. MSB becomes LSB).  Internally, SoX understands how to
581              work with these encodings but there is currently no command line
582              option  to  specify them.  If you need this support then you can
583              use the pseudo file types of `.la' and `.lu' to  inform  SoX  of
584              the encoding.  See FILE TYPES below for more information.
585
586              ADPCM is a form of audio compression that offers a good compromise
587              between audio quality and encoding/decoding speed.  It is used
588              for telephone audio compression and in places where full
589              fidelity is not as important.  When uncompressed, it has roughly
590              the precision of 16-bit PCM audio.  Popular versions of ADPCM
591              include G.726, MS ADPCM, and IMA ADPCM.  The -a flag has
592              different meanings in different file handlers: in .wav files it
593              represents MS ADPCM, and in all others it means G.726 ADPCM.  IMA
594              ADPCM is a specific form of ADPCM compression, slightly simpler
595              and slightly lower fidelity than Microsoft's flavor of ADPCM.
596              IMA ADPCM is also called DVI ADPCM.
597
598              GSM is currently used for the vast majority of the world's digi‐
599              tal wireless telephone calls.  It utilises several audio formats
600              with different bit-rates and associated speech quality.  SoX has
601              support for GSM's original 13kbps `Full Rate' audio format.   It
602              is usually CPU intensive to work with GSM audio.
603
604       -1/-2/-3/-4/-8
605              The sample datum size is 1, 2, 3, 4, or 8 bytes; i.e. 8, 16, 24,
606              32, or 64 bits.
607
608       -b/-w/-l/-d
609              These flags are aliases for -1/-2/-4/-8 respectively, and
610              abbreviate byte, word, long word, and double long (long long)
611              word; they are retained for backwards compatibility only.
612
613   Output File Format Options
614       These options apply only to the output file and may  precede  only  the
615       output filename on the command line.
616
617       -C compression-factor, --compression=compression-factor
618              The compression factor for variably compressing output file for‐
619              mats.  If this option is not given, then a  default  compression
620              factor  will  apply.  The compression factor is interpreted dif‐
621              ferently  for  different  compressing  file  formats.   See  the
622              description  of  the  file formats that use this option for more
623              information.
624

FILE TYPES

626       File types can be set by the filename extension or the -t  option  (see
627       above).  File  types that can be determined by a filename extension are
628       listed with their names preceded by a  dot.  File  types  that  require
629       optional  libsndfile support are marked `(libsndfile)'. File types that
630       can be handled by libsndfile using -t sndfile are marked `(also with -t
631       sndfile)'.   This  might be useful if you have a file that doesn't work
632       with SoX's default format readers and writers, and there's a libsndfile
633       reader and writer for that format.
634
635
636       .raw (also with -t sndfile)
637              Raw (headerless) audio files.  The sample rate, sample size, and
638              data encoding must be given using command-line  format  options;
639              the number of channels defaults to 1.
640
641       .ub, .sb, .uw, .sw, .ul, .al, .lu, .la, .sl (also with -t sndfile)
642              These filename extensions serve as shorthand for identifying the
643              format of headerless audio files.  Thus, ub, sb, uw, sw, ul, al,
644              lu,  la and sl indicate a file with a single audio channel, sam‐
645              ple rate of 8000 Hz, and samples  encoded  as  `unsigned  byte',
646              `signed  byte',  `unsigned word', `signed word', `μ-law' (byte),
647              `A-law' (byte), inverse bit order `μ-law', inverse bit order `A-
648              law',   or  `signed  long'  respectively.   Command-line  format
649              options can also be given to modify the selected  format  if  it
650              does not provide an exact match for a particular file.
651
652              Headerless  audio  files on a SPARC computer are likely to be of
653              format ul;  on a Mac, they're likely to be ub but with a  sample
654              rate of 11025 or 22050 Hz.
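
The shorthand extensions above amount to a small lookup table. The Python sketch below restates that table and uses it to estimate the duration of a headerless file from its size. The names are hypothetical, and the sketch assumes that a word is 2 bytes and a long is 4 bytes (as with the -w and -l flags) and that no format options override the defaults.

```python
# (encoding, bytes per sample) for each headerless shorthand listed above.
RAW_SHORTHANDS = {
    "ub": ("unsigned", 1),
    "sb": ("signed", 1),
    "uw": ("unsigned", 2),
    "sw": ("signed", 2),
    "ul": ("mu-law", 1),
    "al": ("A-law", 1),
    "lu": ("mu-law, inverse bit order", 1),
    "la": ("A-law, inverse bit order", 1),
    "sl": ("signed", 4),
}
DEFAULT_RATE_HZ = 8000   # default sample rate stated in the text
DEFAULT_CHANNELS = 1     # default channel count stated in the text


def describe_raw(filename):
    """Return the implied raw format for a filename, or None if unknown."""
    if "." not in filename:
        return None
    ext = filename.rsplit(".", 1)[-1].lower()
    if ext not in RAW_SHORTHANDS:
        return None
    encoding, size = RAW_SHORTHANDS[ext]
    return {
        "encoding": encoding,
        "bytes_per_sample": size,
        "rate": DEFAULT_RATE_HZ,
        "channels": DEFAULT_CHANNELS,
    }


def duration_seconds(filename, size_bytes):
    """Estimate playing time of a headerless file from its size in bytes."""
    fmt = describe_raw(filename)
    if fmt is None:
        raise ValueError("not a recognised headerless shorthand")
    frame_bytes = fmt["bytes_per_sample"] * fmt["channels"]
    return size_bytes / (frame_bytes * fmt["rate"])
```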
655
656       .8svx (also with -t sndfile)
657              Amiga 8SVX musical instrument description format.
658
659       .aiff, .aif (also with -t sndfile)
660              AIFF  files used on Apple IIc/IIgs and SGI.  Note: the AIFF for‐
661              mat supports only one SSND chunk.  It does not support  multiple
662              audio chunks, or the 8SVX musical instrument description format.
663              AIFF files are multimedia archives and can have  multiple  audio
664              and  picture  chunks.   You may need a separate archiver to work
665              with them.
666
667       .aiffc, .aifc (also with -t sndfile)
668              AIFF-C (not compressed, linear), defined in  DAVIC  1.4  Part  9
669              Annex  B.   This  format is referred from ARIB STD-B24, which is
670              specified for Japanese data broadcasting.   Any  private  chunks
671              are not supported.
672
673              Note: The input file is currently processed as .aiff.
674
675       alsa   ALSA  default device driver.  This is a pseudo-file type and can
676              be optionally compiled into SoX.  Run sox -h to see if you  have
677              support  for this file type.  When this driver is used it allows
678              you to open up an ALSA device and configure it to  use  the  same
679              data  format as passed in to SoX.  It works for both playing and
680              recording audio files.  When playing audio files it attempts  to
681              set up the ALSA driver to use the same format as the input file.
682              It is suggested to always override the output values to use  the
683              highest  quality  format  your ALSA system can handle.  Example:
684              sox infile -t alsa default
685
686       .au, .snd (also with -t sndfile)
687              Sun Microsystems AU files.  There are many types of AU file; DEC
688              has  invented  its  own  with  a different magic number and byte
689              order.  SoX can read these files but will not write them.   Some
690              .au files are known to have invalid AU headers; these are proba‐
691              bly original Sun μ-law 8000 Hz files and can be dealt with using
692              the .ul format (see below).
693
694              It  is  possible to override AU file header information with the
695              -r and -c options, in which case SoX will  issue  a  warning  to
696              that effect.
697
698       auto   This  format  type name exists for backwards compatibility only.
699              If given for an input file it will be silently ignored, if given
700              for an output file it will cause SoX to exit with an error.
701
702       .avr   Audio  Visual  Research.  The AVR format is produced by a number
703              of commercial packages on the Mac.
704
705       .caf (libsndfile)
706              Core Audio File format.
707
708       .cdda, .cdr
709              `Red Book' Compact Disc Digital Audio.  CDDA has two audio chan‐
710              nels  formatted  as  16-bit  signed integers at a sample rate of
711              44.1 kHz.  The number of (stereo) samples in each CDDA track  is
712              always a multiple of 588 which is why it needs its own handler.
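
The figure of 588 follows from the CD sector rate: 44100 samples per second divided by 75 sectors per second. The Python sketch below shows how a track length could be rounded up to a whole number of sectors. The sector rate and the idea of padding with silence are assumptions made for illustration; the text above does not say how SoX itself meets the requirement.

```python
CDDA_RATE_HZ = 44100
SECTORS_PER_SECOND = 75  # assumption: standard audio CD sector rate
SAMPLES_PER_SECTOR = CDDA_RATE_HZ // SECTORS_PER_SECOND  # 588


def padded_length(stereo_samples):
    """Round a stereo sample count up to a multiple of 588."""
    sectors = -(-stereo_samples // SAMPLES_PER_SECTOR)  # ceiling division
    return sectors * SAMPLES_PER_SECTOR


def padding_needed(stereo_samples):
    """Number of silent stereo samples needed to fill the last sector."""
    return padded_length(stereo_samples) - stereo_samples
```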
713
714       .cvsd, .cvs
715              Continuously Variable Slope Delta modulation.  A headerless for‐
716              mat used to compress speech audio for applications such as voice
717              mail.   This format is sometimes used with bit-reversed samples;
718              the -X format option can be used to set the bit-order.
719
720       .dat   Text Data files.  These files contain a  textual  representation
721              of  the  sample  data.   There is one line at the beginning that
722              contains the sample rate.  Subsequent lines contain two  numeric
723              data items: the time since the beginning of the first sample and
724              the sample value.  Values are normalized so that the maximum and
725              minimum  are  1  and -1.  This file format can be used to create
726              data files for external programs such as FFT analysers or  graph
727              routines.   SoX can also convert a file in this format back into
728              one of the other file formats.
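
As an illustration of the layout described above, the Python sketch below produces the lines of such a text file from integer samples. The exact header syntax (`; Sample Rate N`), the number of decimal places, and the choice to normalize by the integer full-scale value are assumptions made for this sketch, not a specification of what SoX writes.

```python
def to_dat_lines(samples, rate, full_scale):
    """Return text lines: one header line, then 'time value' pairs.

    samples    -- integer sample values for a single channel
    rate       -- sample rate in Hz
    full_scale -- magnitude that maps to 1.0 (e.g. 32768 for 16-bit audio)
    """
    lines = ["; Sample Rate %d" % rate]  # assumed header syntax
    for index, sample in enumerate(samples):
        time = index / rate             # seconds since the first sample
        value = sample / full_scale     # normalized to the range -1..1
        lines.append("%.8f %.8f" % (time, value))
    return lines
```

For example, to_dat_lines([0, 16384, -32768], 8000, 32768) yields a header line followed by three lines whose values are 0, 0.5 and -1.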
729
730       .dvms, .vms
731              Used to compress speech audio for  applications  such  as  voice
732              mail.  A self-describing variant of cvsd.
733
734       .fap (libsndfile)
735              See .paf.
736
737       .flac (also with -t sndfile)
738              Free  Lossless  Audio  CODEC compressed audio.  FLAC is an open,
739              patent-free CODEC designed for compressing music.  It is similar
740              to  MP3 and Ogg Vorbis, but lossless, meaning that audio is com‐
741              pressed in FLAC without any loss in quality.
742
743              SoX can decode native FLAC files (.flac) but not Ogg FLAC  files
744              (.ogg).  [But see .ogg below for information relating to support
745              for Ogg Vorbis files.]
746
747              SoX has basic support for writing FLAC files: it can  encode  to
748              native  FLAC  using compression levels 0 to 8.  8 is the default
749              compression level and gives the best (but slowest)  compression;
750              0  gives  the  least (but fastest) compression.  The compression
751              level can be selected using the -C option  (see  above)  with  a
752              whole number from 0 to 8.
753
754              FLAC  support  in  SoX  is  optional  and requires optional FLAC
755              libraries.  To see if there is support for FLAC run sox  -h  and
756              look for it under the list of supported file formats as `flac'.
757
758       .fssd  An alias for the .ub format.
759
760       .gsm (also with -t sndfile)
761              GSM  06.10  Lossy  Speech  Compression.  A lossy format for com‐
762              pressing speech which is used in the Global System  for  Mobile
763              Communications  (GSM).   It's good for its purpose, shrinking
764              audio data size, but it will introduce  lots  of  noise  when  a
765              given  audio signal is encoded and decoded multiple times.  This
766              format is used by some voice mail applications.   It  is  rather
767              CPU intensive.
768
769              GSM  in  SoX  is optional and requires access to an external GSM
770              library.  To see if there is support for GSM run sox -h and look
771              for it under the list of supported file formats.
772
773       .hcom  Macintosh  HCOM  files.   These  are (apparently) Mac FSSD files
774              with some variant of Huffman  compression.   The  Macintosh  has
775              wacky  file  formats  and this format handler apparently doesn't
776              handle all the ones it should.  Mac users will need their  usual
777              arsenal  of  file  converters to deal with an HCOM file on other
778              systems.
779
780       ircam (also with -t sndfile)
781              Another name for .sf.
782
783       .ima (also with -t sndfile)
784              A headerless file of IMA ADPCM  audio  data.  IMA  ADPCM  claims
785              16-bit  precision packed into only 4 bits, but in fact sounds no
786              better than .vox.
787
788       .mat, .mat4, .mat5 (libsndfile)
789              Matlab 4.2/5.0 (respectively GNU Octave 2.0/2.1) format (.mat is
790              the same as .mat4).
791
792       .maud  An  IFF-conforming audio file type, registered by MS MacroSystem
793              Computer GmbH, published along with the `Toccata' sound-card  on
794              the  Amiga.   Allows  8bit linear, 16bit linear, A-Law, μ-law in
795              mono and stereo.
796
797       .mp3, .mp2
798              MP3 compressed audio.  MP3 (MPEG Layer 3) is part  of  the  MPEG
799              standards  for  audio and video compression.  It is a lossy com‐
800              pression format that achieves good compression rates with little
801              quality loss.  See also Ogg Vorbis for a similar format.
802
803              MP3  support in SoX is optional and requires access to either or
804              both the external libmad and libmp3lame libraries.   To  see  if
805              there  is  support  for MP3 run sox -h and look for it under the
806              list of supported file formats as `mp3'.
807
808
809       .nist (also with -t sndfile)
810              See .sph.
811
812       .ogg, .vorbis
813              Ogg Vorbis compressed audio.  Ogg Vorbis is an open, patent-free
814              CODEC designed for compressing music and streaming audio.  It is
815              a lossy compression format (similar to  MP3,  VQF  &  AAC)  that
816              achieves good compression rates with a minimum amount of quality
817              loss.  See also MP3 for a similar format.
818
819              SoX can decode all types of Ogg Vorbis files, and can encode  at
820              different compression levels/qualities given as a number from -1
821              (highest compression/lowest quality) to 10 (lowest  compression,
822              highest  quality).   By  default the encoding quality level is 3
823              (which gives an encoded rate of approx. 112kbps), but  this  can
824              be changed using the -C option (see above) with a number from -1
825              to 10; fractional numbers (e.g.  3.6) are also allowed.
826
827              Decoding is somewhat CPU intensive  and  encoding  is  very  CPU
828              intensive.
829
830              Ogg  Vorbis  in  SoX is optional and requires access to external
831              Ogg Vorbis libraries.  To see if there is support for Ogg Vorbis
832              run sox -h and look for it under the list of supported file for‐
833              mats as `vorbis'.
834
835       ossdsp OSS /dev/dsp device driver.  This is a pseudo-file that  can  be
836              optionally  compiled  into SoX.  Run sox -h to see if it is sup‐
837              ported. When this driver is used  it  allows  you  to  play  and
838              record  sounds on supported systems. When playing audio files it
839              attempts to set up the OSS driver to use the same format as  the
840              input file. It is suggested to always override the output values
841              to use the highest quality format your OSS  system  can  handle.
842              Example: sox infile -t ossdsp -w -s /dev/dsp
843
844       .paf, .fap (libsndfile)
845              Ensoniq PARIS file format (big and little-endian respectively).
846
847       .prc   Psion  Record.  Used in some Psion devices for System alarms and
848              recordings made by the built-in Record application. This  format
849              is  newer  than  the .wve format that is also used in some Psion
850              devices.
851
852       .pvf (libsndfile)
853              Portable Voice Format.
854
855       .sd2 (libsndfile)
856              Sound Designer 2 format.
857
858       .sds (libsndfile)
859              MIDI Sample Dump Standard.
860
861       .sf (also with -t sndfile)
862              IRCAM  SDIF  (Institut  de  Recherche  et  Coordination   Acous‐
863              tique/Musique  Sound  Description  Interchange  Format). Used by
864              academic music software such as  the  CSound  package,  and  the
865              MixView sound sample editor.
866
867       .sph, .nist (also with -t sndfile)
868              SPHERE  (SPeech  HEader  Resources)  is a file format defined by
869              NIST (National Institute of Standards  and  Technology)  and  is
870              used with speech audio.  SoX can read these files when they con‐
871              tain μ-law and PCM data.  It will ignore any header  information
872              that  says  the data is compressed using shorten compression and
873              will treat the data as either μ-law or PCM.  This will allow SoX
874              and  the  command  line shorten program to be run together using
875              pipes to uncompress the data and then pass  the  result  to  SoX
876              for processing.
877
878       .smp   Turtle Beach SampleVision files.  SMP files are for use with the
879              PC-DOS package SampleVision by  Turtle  Beach  Softworks.   This
880              package is for communication to several MIDI samplers.  All sam‐
881              ple rates are supported by the package,  although  not  all  are
882              supported by the samplers themselves.  Currently loop points are
883              ignored.
884
885       .snd   See .au.
886
887       sndfile
888              This is a pseudo-type that forces libsndfile to  be  used,  even
889              for  file  types normally handled internally by SoX. For writing
890              files, the actual file type is then taken from the  output  file
891              name;  for  reading  them,  it  is deduced from the file and any
892              other format parameters.  This pseudo-type depends on SoX having
893              been built with optional libsndfile support.
894
895       .sndt  SoundTool files. This is an older DOS file format.
896
897       .sou   An alias for the .ub format.
898
899       sunau  Sun  /dev/audio  device  driver.  This is a pseudo-file type and
900              can be optionally compiled into SoX.  Run sox -h to see  if  you
901              have  support  for  this file type.  When this driver is used it
902              allows you to open up a Sun /dev/audio file and configure it  to
903              use  the  same data type as passed in to SoX.  It works for both
904              playing and recording audio files.  When playing audio files  it
905              attempts  to  set  up the audio driver to use the same format as
906              the input file.  It is suggested to always override  the  output
907              values  to use the highest quality format your hardware can han‐
908              dle.  Example: sox infile -t  sunau  -w  -s  /dev/audio  or  sox
909              infile -t sunau -U -c 1 /dev/audio for older sun equipment.
910
911       .txw   Yamaha  TX-16W  sampler.   A  file format from a Yamaha sampling
912              keyboard which wrote IBM-PC format 3.5" floppies.  Handles read‐
913              ing  of files which do not have the sample rate field set to one
914              of the expected values by looking at some other  bytes  in  the
915              attack/loop  length fields, and defaulting to 33 kHz if the sam‐
916              ple rate is still unknown.
917
918       .vms   See .dvms.
919
920       .voc (also with -t sndfile)
921              Sound Blaster VOC files.  VOC files are multi-part  and  contain
922              silence parts, looping, and different sample rates for different
923              chunks.  On input, the silence parts are filled out,  loops  are
924              rejected,  and  sample  data with a new sample rate is rejected.
925              Silence with a different sample rate is generated appropriately.
926              On  output,  silence  is not detected, nor are impossible sample
927              rates.  Note, this version now supports playing VOC  files  with
928              multiple  blocks and supports playing files containing μ-law and
929              A-law samples.
930
931       .vorbis
932              See .ogg.
933
934       .vox (also with -t sndfile)
935              A headerless file of  Dialogic/OKI  ADPCM  audio  data  commonly
936              comes  with the extension .vox.  This ADPCM data has 12-bit pre‐
937              cision packed into only 4 bits.
938
939       .w64 (libsndfile)
940              Sonic Foundry's 64-bit RIFF/WAV format.
941
942       .wav (also with -t sndfile)
943              Microsoft .WAV RIFF files.  This is the native audio file format
944              of Windows, and widely used for uncompressed audio.
945
946              Normally  .wav  files  have  all formatting information in their
947              headers, and so do not need any format options specified for  an
948              input file.  If any are, they will override the file header, and
949              you will be warned to this effect.  You had better know what you
950              are doing! Output format options will cause a format conversion,
951              and the .wav will be written appropriately.
952
953              SoX currently can read PCM, μ-law, A-law, MS ADPCM, and IMA  (or
954              DVI)  ADPCM.   It  can  write all of these formats including the
955              ADPCM encoding.  Big endian versions of RIFF files, called RIFX,
956              can  also be read and written.  To write a RIFX file, use the -B
957              option with the output file options.
958
959       .wve   Psion 8-bit A-law.  Used on older Psion PDAs.
960
961       .xa    Maxis XA files.  These are 16-bit  ADPCM  audio  files  used  by
962              Maxis  games.   Writing  .xa  files  is currently not supported,
963              although adding write support should not be very difficult.
964
965       .xi (libsndfile)
966              Fasttracker 2 Extended Instrument format.
967

EFFECTS

969       Multiple effects may be applied to the audio  by  specifying  them  one
970       after another at the end of the command line.
971
972       Note:  Brackets  [  ]  are used to denote parameters that are optional,
973       braces { } to denote those that are both optional and  repeatable,  and
974       angle  brackets  <  >  to  denote  those  that  are  repeatable but not
975       optional.
976
977       allpass frequency width[h|o|q]
978              Apply a two-pole all-pass filter with central frequency (in  Hz)
979              frequency,  and  filter-width  width:  in Hz (the default, or if
980              appended with `h'), in octaves (if appended with `o'), or  as  a
981              Q-factor (if appended with `q').  An all-pass filter changes the
982              audio's frequency to phase  relationship  without  changing  its
983              frequency to amplitude relationship.  The filter is described in
984              detail in [1].
985
986              This effect supports the --octave global option.
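
The defining property of an all-pass filter (unchanged amplitude, altered phase) can be checked numerically. The Python sketch below assumes that [1] refers to the widely used `Audio EQ Cookbook' biquad formulae and that a width given in Hz is converted to a Q-factor as frequency / width; both are assumptions, and the function names are hypothetical.

```python
import cmath
import math


def allpass_coefficients(frequency, width_hz, rate):
    """Return (b, a) biquad coefficients for a two-pole all-pass filter."""
    w0 = 2 * math.pi * frequency / rate
    q = frequency / width_hz              # assumed Hz-to-Q conversion
    alpha = math.sin(w0) / (2 * q)
    b = (1 - alpha, -2 * math.cos(w0), 1 + alpha)
    a = (1 + alpha, -2 * math.cos(w0), 1 - alpha)
    return b, a


def response(b, a, freq, rate):
    """Complex frequency response of the biquad at freq (Hz)."""
    z1 = cmath.exp(-1j * 2 * math.pi * freq / rate)  # z^-1 on the unit circle
    numerator = b[0] + b[1] * z1 + b[2] * z1 * z1
    denominator = a[0] + a[1] * z1 + a[2] * z1 * z1
    return numerator / denominator
```

With these coefficients the magnitude of the response is 1 at every frequency, while the phase passes through 180 degrees at the central frequency.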
987
988       band [-n] center [width[h|o|q]]
989              Apply a band-pass filter.  The frequency  response  drops  loga‐
990              rithmically  around  the center frequency.  The width in Hz (the
991              default, or if appended with `h'), in octaves (if appended  with
992              `o'),  or  as a Q-factor (if appended with `q'), gives the slope
993              of the drop.  The frequencies at center +  width  and  center  -
994              width  will be half of their original amplitudes.  band defaults
995              to a mode oriented to pitched audio,  i.e.  voice,  singing,  or
996              instrumental  music.   The -n (for noise) option uses the alter‐
997              nate mode for un-pitched audio (e.g. percussion).   Warning:  -n
998              introduces  a  power-gain of about 11dB in the filter, so beware
999              of output clipping.  band introduces noise in the shape  of  the
1000              filter, i.e. peaking at the center frequency and settling around
1001              it.
1002
1003              This effect supports the --octave global option.
1004
1005              See also filter for a bandpass filter with steeper shoulders.
1006
1007       bandpass|bandreject [-c] frequency width[h|o|q]
1008              Apply a two-pole Butterworth  band-pass  or  band-reject  filter
1009              with  central frequency (in Hz) frequency, and (3dB-point) band-
1010              width width: in Hz (the default, or if appended  with  `h'),  in
1011              octaves  (if  appended  with `o'), or as a Q-factor (if appended
1012              with `q').  The -c option applies only to bandpass and selects a
1013              constant skirt gain (peak gain = Q) instead of the default: con‐
1014              stant 0dB peak gain.  The filters roll off  at  6dB  per  octave
1015              (20dB per decade) and are described in detail in [1].
1016
1017              These effects support the --octave global option.
1018
1019              See also filter for a bandpass filter with steeper shoulders.
1020
1021       bandreject frequency width[h|o|q]
1022              Apply a band-reject filter.  See the description of the bandpass
1023              effect for details.
1024
1025       bass|treble gain [frequency [width[s|h|o|q]]]
1026              Boost or cut the bass (lower) or treble (upper)  frequencies  of
1027              the audio using a two-pole shelving filter with a response simi‐
1028              lar to that of a  standard  hi-fi's  (Baxandall)  tone-controls.
1029              This is also known as shelving equalisation (EQ).
1030
1031              gain  gives  the dB gain at 0 Hz (for bass), or whichever is the
1032              lower of ∼22 kHz and the Nyquist frequency  (for  treble).   Its
1033              useful  range is about -20 (for a large cut) to +20 (for a large
1034              boost).  Beware of Clipping when using a positive gain.
1035
1036              If desired, the filter can be  fine-tuned  using  the  following
1037              optional parameters:
1038
1039              frequency sets the filter's central frequency and so can be used
1040              to extend or reduce the frequency range to be  boosted  or  cut.
1041              The default value is 100 Hz (for bass) or 3 kHz (for treble).
1042
1043              width  determines how steep the filter's shelf transition is and
1044              can be expressed as: a `slope' (the default, or if appended with
1045              `s'), a Q-factor (if appended with `q'), the transition width in
1046              octaves (if appended with `o'), or the transition  width  in  Hz
1047              (if  appended  with  `h').  The useful range of `slope' is about
1048              0.3, for a gentle slope, to 1 (the maximum), for a steep  slope;
1049              the default value is 0.5.
1050
1051              The filters are described in detail in [1].
1052
1053              These effects support the --octave global option.
1054
1055              See also equalizer for a peaking equalisation effect.
1056
1057       chorus gain-in gain-out <delay decay speed depth -s|-t>
1058              Add   a   chorus   effect   to   the   audio.   Each  four-tuple
1059              delay/decay/speed/depth gives the delay in milliseconds and  the
1060              decay  (relative to gain-in) with a modulation speed in Hz using
1061              depth in milliseconds.  The modulation is either sinusoidal (-s)
1062              or triangular (-t).  Gain-out is the volume of the output.
1063
1064       compand attack1,decay1{,attack2,decay2}
1065              in-dB1,out-dB1{,in-dB2,out-dB2}
1066              [gain [initial-volume [delay]]]
1067
1068              Compand  (compress  or  expand)  the dynamic range of the audio.
1069              The attack and decay time  specify  the  integration  time  over
1070              which  the  absolute  value of the input signal is integrated to
1071              determine its volume; attacks refer to increases in  volume  and
1072              decays  refer  to  decreases.   Where  more  than  one  pair  of
1073              attack/decay parameters are specified, each channel  is  treated
1074              separately and the number of pairs must agree with the number of
1075              input channels.  The second parameter is a list of points on the
1076              compander's  transfer  function  specified in dB relative to the
1077              maximum possible signal amplitude.  The input values must be  in
1078              a  strictly  increasing order but the transfer function does not
1079              have to be monotonically rising.  The special value -inf may  be
1080              used  to  indicate  that the input volume should be associated
1081              with the output volume.  The points -inf,-inf and 0,0 are assumed;
1082              the latter may be overridden, but the former may not.
1083
1084              The  third  (optional) parameter is a post-processing gain in dB
1085              which is applied after the  compression  has  taken  place;  the
1086              fourth  (optional)  parameter is an initial volume to be assumed
1087              for each channel when the effect starts.  This permits the  user
1088              to  supply  a  nominal  level initially, so that, for example, a
1089              very large gain is not applied to initial signal  levels  before
1090              the companding action has begun to operate: it is quite probable
1091              that in such an event, the  output  would  be  severely  clipped
1092              while the compander gain properly adjusts itself.
1093
1094              The fifth (optional) parameter is a delay in seconds.  The input
1095              signal is analysed immediately to control the compander, but  it
1096              is  delayed before being fed to the volume adjuster.  Specifying
1097              a delay approximately equal to the attack/decay times allows the
1098              compander to effectively operate in a `predictive' rather than a
1099              reactive mode.
1100
1101              See also mcompand for a multiple-band companding effect.
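
The transfer function described above can be pictured as straight-line interpolation between the given points on a dB/dB graph. The Python sketch below is an illustration only: the function name is hypothetical, and the behaviour below the lowest given point (keeping that point's gain offset) and above the highest point (holding its output level) are assumptions, not a description of the SoX implementation.

```python
import math


def transfer_db(in_db, points):
    """Map an input level (dB) to an output level (dB).

    points -- list of (in_dB, out_dB) pairs, strictly increasing in in_dB
    """
    pts = sorted(points)
    if pts[-1][0] < 0:
        pts.append((0.0, 0.0))          # implied 0,0 point unless overridden
    if in_db == -math.inf:
        return -math.inf                # implied -inf,-inf point
    if in_db <= pts[0][0]:
        # Assumption: below the first point, keep that point's gain offset.
        return in_db + (pts[0][1] - pts[0][0])
    if in_db >= pts[-1][0]:
        return pts[-1][1]               # assumption: hold the last level
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 <= in_db <= x1:
            return y0 + (y1 - y0) * (in_db - x0) / (x1 - x0)
```

With the points -60,-40 and -20,-10, an input at -40 dB lies halfway along the first segment and is mapped to -25 dB.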
1102
1103       dcshift shift [limitergain]
1104              DC Shift the audio, with basic linear amplitude  formula.   This
1105              is  most  useful if your audio tends to not be centered around a
1106              value of 0.  Shifting it back will allow you  to  get  the  most
1107              volume adjustments without clipping.
1108
1109              The  first  option is the dcshift value.  It is a floating point
1110              number that indicates the amount to shift.
1111
1112              An optional limitergain can be specified  as  well.   It  should
1113              have  a  value  much less than 1 (e.g. 0.05 or 0.02) and is used
1114              only on peaks to prevent clipping.
1115
1116       deemph Apply a treble attenuation shelving filter to audio in  audio-CD
1117              format.   The frequency response of pre-emphasized recordings is
1118              rectified.  The filter is defined in the standard  document  ISO
1119              908.
1120
1121              This effect supports the --octave global option.
1122
1123              See also the bass and treble shelving equalisation effects.
1124
1125       dither [depth]
1126              Apply dithering to the audio.  Dithering deliberately adds digi‐
1127              tal white noise to the signal in order to mask audible quantiza‐
1128              tion  effects  that  can occur if the output sample size is less
1129              than 24 bits.  By default, the amount of noise added is  ½  bit;
1130              the optional depth parameter is a (linear or voltage) multiplier
1131              of this amount.
1132
1133              This effect should not be followed  by  any  other  effect  that
1134              affects the audio.
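
The idea behind dithering can be sketched in a few lines of Python. This is an illustration rather than the SoX algorithm: it assumes floating-point samples in the range -1 to 1, interprets `½ bit' as uniformly distributed noise of up to half a least-significant bit either side of zero, and uses hypothetical names throughout.

```python
import random


def quantize_with_dither(samples, bits, depth=1.0, seed=None):
    """Quantize samples in -1..1 to signed integers of the given size.

    depth -- linear multiplier of the default noise amount
    seed  -- optional seed so that results can be reproduced
    """
    rng = random.Random(seed)
    scale = 2 ** (bits - 1) - 1
    out = []
    for sample in samples:
        noise = rng.uniform(-0.5, 0.5) * depth   # assumed noise distribution
        value = round(sample * scale + noise)
        # Clamp so that the added noise cannot push a peak out of range.
        out.append(max(-scale - 1, min(scale, value)))
    return out
```

Because the noise is added before rounding, the rounding error no longer follows the signal exactly, which is what masks the quantization effects.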
1135
1136       earwax Makes  audio  easier to listen to on headphones.  Adds `cues' to
1137              audio in audio-CD format so that when listened to on  headphones
1138              the  stereo  image  is moved from inside your head (standard for
1139              headphones) to outside and in front of  the  listener  (standard
1140              for  speakers).  See http://www.geocities.com/beinges for a full
1141              explanation.
1142
1143       echo gain-in gain-out <delay decay>
1144              Add echoing to the audio.  Each delay decay pair gives the delay
1145              in  milliseconds  and  the  decay  (relative to gain-in) of that
1146              echo.  Gain-out is the volume of the output.
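
A minimal Python sketch of the idea follows. It assumes each echo is a delayed copy of the original input scaled by its decay, with no feedback between echoes; the function name and the handling of the tail are hypothetical, and the sketch is not taken from the SoX source.

```python
def echo(samples, rate, gain_in, gain_out, taps):
    """Add echoes to a list of samples.

    rate -- sample rate in Hz
    taps -- list of (delay_ms, decay) pairs
    """
    # Convert each delay from milliseconds to a whole number of samples.
    delays = [(int(round(ms * rate / 1000.0)), decay) for ms, decay in taps]
    tail = max((d for d, _ in delays), default=0)
    padded = list(samples) + [0.0] * tail   # room for the last echo to ring out
    out = []
    for n in range(len(padded)):
        acc = gain_in * padded[n]
        for delay, decay in delays:
            if n >= delay:
                acc += decay * padded[n - delay]
        out.append(gain_out * acc)
    return out
```

Feeding a single impulse through the function shows each echo arriving at its delay with its decay as amplitude.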
1147
1148       echos gain-in gain-out <delay decay>
1149              Add a sequence of echoes to the audio.  Each  delay  decay  pair
1150              gives the delay in milliseconds and the decay (relative to gain-
1151              in) of that echo.  Gain-out is the volume of the output.
1152
1153       equalizer frequency width[q|o|h] gain
1154              Apply a two-pole peaking equalisation (EQ)  filter.   With  this
1155              filter,  the signal-level at and around a selected frequency can
1156              be increased or decreased, whilst (unlike  band-pass  and  band-
1157              reject filters) that at all other frequencies is unchanged.
1158
1159              frequency gives the filter's central frequency in Hz, width, the
1160              band-width, as a Q-factor [2] (the default, or if appended  with
1161              `q'),  in  octaves (if appended with `o'), or in Hz (if appended
1162              with `h'), and gain the required  gain  or  attenuation  in  dB.
1163              Beware of Clipping when using a positive gain.
1164
1165              In order to produce complex equalisation curves, this effect can
1166              be given several times, each with a different central frequency.
1167
1168              The filter is described in detail in [1].
1169
1170              This effect supports the --octave global option.
1171
1172              See also bass and treble for shelving equalisation effects.
1173
1174       fade [type] fade-in-length [stop-time [fade-out-length]]
1175              Add a fade effect to the beginning, end, or both of the audio.
1176
1177              For fade-ins, this starts from the first sample  and  ramps  the
1178              volume  of  the  audio from 0 to full volume over fade-in-length
1179              seconds.  Specify 0 seconds if no fade-in is wanted.
1180
1181              For fade-outs, the audio will be truncated at stop-time and  the
1182              volume  will  be  ramped  from full volume down to 0 starting at
1183              fade-out-length seconds  before  the  stop-time.   If  fade-out-
1184              length  is not specified, it defaults to the same value as fade-
1185              in-length.  No fade-out is performed if stop-time is not  speci‐
1186              fied.
1187
1188              All  times  can be specified in either periods of time or sample
1189              counts.  To specify time periods, use the hh:mm:ss.frac format.
1190              To specify sample counts, give the number of
1191              samples and append the letter `s' to the sample count (for exam‐
1192              ple `8000s').
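
              As an illustration only, the two notations above can be reduced
              to a sample count as in the following Python sketch.  The helper
              name time_spec_to_samples is hypothetical and is not part of
              SoX; accepting fewer than three colon-separated fields is an
              assumption of the sketch.

```python
def time_spec_to_samples(spec: str, sample_rate: int) -> int:
    """Convert 'hh:mm:ss.frac' or '<count>s' to a number of samples.

    Hypothetical helper for illustration; not SoX's own parser.
    """
    if spec.endswith("s"):
        # A trailing `s' marks an exact sample count, e.g. `8000s'.
        return int(spec[:-1])
    seconds = 0.0
    # Fold the colon-separated fields from left to right:
    # hours -> minutes -> seconds.
    for field in spec.split(":"):
        seconds = seconds * 60 + float(field)
    return round(seconds * sample_rate)
```

              With an 8000 Hz sample rate, `0:00:01.5' corresponds to 12000
              samples, whereas `8000s' is 8000 samples at any rate.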
1193
1194              An  optional  type  can be specified to change the type of enve‐
1195              lope.  Choices are q for quarter of a sine wave, h  for  half  a
1196              sine  wave,  t  for  linear  slope, l for logarithmic, and p for
1197              inverted parabola.  The default is a linear slope.
1198
1199       filter [low]-[high] [window-len [beta]]
1200              Apply a sinc-windowed lowpass, highpass, or bandpass  filter  of
1201              given  window length to the signal.  low refers to the frequency
1202              of the lower 6dB corner of the filter.  high refers to the  fre‐
1203              quency of the upper 6dB corner of the filter.
1204
1205              A  low-pass filter is obtained by leaving low unspecified, or 0.
1206              A high-pass filter is obtained by leaving high  unspecified,  or
1207              0, or greater than or equal to the Nyquist frequency.
1208
1209              The window-len, if unspecified, defaults to 128.  Longer windows
1210              give a sharper cutoff, smaller windows a more gradual cutoff.
1211
1212              The beta, if unspecified, defaults to 16.  This selects a Kaiser
1213              window.   You can select a Nuttall window by specifying anything
1214              ≤ 2 here.  For more discussion of beta, look under the  resample
1215              effect.
1216
1217
1218       flanger [delay depth regen width speed shape phase interp]
1219              Apply  a  flanging  effect  to  the  audio.   All parameters are
1220              optional (right to left).
1221
1222             ┌─────────────────────────────────────────────────────────────────┐
1223             │          Range     Default   Description                        │
1224             │delay     0 - 10       0      Base delay in milliseconds.        │
1225             │depth     0 - 10       2      Added swept delay in milliseconds. │
1226             │regen    -95 - 95      0      Percentage regeneration (delayed   │
1227             │                              signal feedback).                  │
1228             │width    0 - 100      71      Percentage of delayed signal mixed │
1229             │                              with original.                     │
1230             │speed    0.1 - 10     0.5     Sweeps per second (Hz).            │
1231             │shape                 sin     Swept wave shape: sine|triangle.   │
1232             │phase    0 - 100      25      Swept wave percentage phase-shift  │
1233             │                              for multi-channel (e.g. stereo)    │
1234             │                              flange; 0 = 100 = same phase on    │
1235             │                              each channel.                      │
1236             │interp                lin     Digital delay-line interpolation:  │
1237             │                              linear|quadratic.                  │
1238             └─────────────────────────────────────────────────────────────────┘
1239              See [3] for a detailed description of flanging.
1240
1241       highpass|lowpass [-1|-2] frequency [width[q|o|h]]
1242              Apply a high-pass or low-pass filter with 3dB  point  frequency.
1243              The  filter  can be either single-pole (with -1), or double-pole
1244              (the default, or with -2).  width applies  only  to  double-pole
1245              filters  and is the filter-width: as a Q-factor (the default, or
1246              if appended with `q'), in octaves (if appended with `o'), or  in
1247              Hz  (if  appended  with `h'); the default Q is 0.707 and gives a
1248              Butterworth response.  The filters roll off at 6dB per pole  per
1249              octave  (20dB per pole per decade).  The double-pole filters are
1250              described in detail in [1].
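
              The two roll-off figures quoted above are the same slope
              expressed per octave and per decade.  The following Python
              sketch shows the arithmetic; the function name
              asymptotic_rolloff_db is hypothetical, and the formula is the
              idealised far-from-cutoff slope, not a model of the actual
              filter response.

```python
import math


def asymptotic_rolloff_db(poles: int, frequency_ratio: float) -> float:
    """Idealised attenuation in dB across a frequency span of the given ratio.

    Each pole contributes 20*log10(ratio) dB far from the cutoff frequency.
    Illustration only; not taken from the SoX sources.
    """
    return 20.0 * poles * math.log10(frequency_ratio)


per_octave = asymptotic_rolloff_db(1, 2.0)    # about 6.02 dB for one pole
per_decade = asymptotic_rolloff_db(1, 10.0)   # 20 dB for one pole
```

              A double-pole filter therefore rolls off at roughly 12dB per
              octave under this approximation.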
1251
1252              These effects support the --octave global option.
1253
1254              See also filter for filters with a steeper roll-off.
1255
1256       lowpass [-1|-2] frequency [width[q|o|h]]
1257              Apply a low-pass filter.  See the description  of  the  highpass
1258              effect for details.
1259
1260       mcompand "attack1,decay1{,attack2,decay2}
1261              in-dB1,out-dB1{,in-dB2,out-dB2}
1262              [gain [initial-volume [delay]] ]" xover-freq
1263
1264              Multi-band compander is similar to the single band compander but
1265              the audio is first divided up into bands and then the  compander
1266              is  run  on each band. See the compand effect for the definition
1267              of its options. Compand options  are  specified  between  double
1268              quotes  and  the  crossover frequency for that band is specified
1269              separately with xover-freq. This can be repeated multiple  times
1270              to create multiple bands.
1271
1272       mixer [ -l|-r|-f|-b|-1|-2|-3|-4|n{,n} ]
1273              Reduce the number of audio channels by mixing or selecting chan‐
1274              nels, or increase the number of channels  by  duplicating  chan‐
1275              nels.   Note:  this effect operates on the audio channels within
1276              the SoX effects processing chain; it should not be confused with
1277              the  -m  global  option  (where  multiple files are mix-combined
1278              before entering the effects chain).
1279
1280              This effect is used automatically when the number of input
1281              channels differs from the number of output channels.  When reducing
1282              the number of channels it is possible to  manually  specify  the
1283              mixer effect and use the -l, -r, -f, -b, -1, -2, -3 or -4 options
1284              to select only the left, right, front or back channel(s), or a
1285              specific channel, for the output instead of averaging the channels.
1286              The -l and -r options average channels in quad-channel files,
1287              so select the exact channel to prevent this.
1288
1289              The mixer effect can also be invoked with up to 16 numbers, sep‐
1290              arated by commas, which specify the proportion (0 = 0% and  1  =
1291              100%) of each input channel that is to be mixed into each output
1292              channel.  In two-channel mode, 4 numbers are given: l → l,  l  →
1293              r,  r  →  l, and r → r, respectively.  In four-channel mode, the
1294              first 4 numbers give the proportions for the  left-front  output
1295              channel,  as  follows:  lf  → lf, rf → lf, lb → lf, and rb → lf.
1296              The next 4 give the right-front output in the same  order,  then
1297              left-back and right-back.
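
              The two-channel case can be pictured as a small matrix applied
              to every frame.  The Python sketch below is an illustration
              under that reading of the text; the name mix_stereo and the
              representation of audio as (left, right) tuples are assumptions
              of the sketch, not SoX internals.

```python
def mix_stereo(frames, ll, lr, rl, rr):
    """Apply the four proportions l->l, l->r, r->l, r->r to stereo frames.

    `frames` is a list of (left, right) sample pairs.  Hypothetical helper
    for illustration; proportions run from 0 (0%) to 1 (100%).
    """
    return [
        (left * ll + right * rl,   # new left  = l->l part + r->l part
         left * lr + right * rr)   # new right = l->r part + r->r part
        for left, right in frames
    ]
```

              Under this reading, the numbers 1,0,0,1 leave the audio
              unchanged and 0,1,1,0 exchange the two channels.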
1298
1299              It  is  also  possible to use the 16 numbers to expand or reduce
1300              the channel count; just specify 0 for unused channels.
1301
1302              Finally, certain reduced combinations of numbers can be specified
1303              for certain input/output channel combinations.
1304
1305                  ┌──────────────────────────────────────────────────────┐
1306                  │In Ch   Out Ch   Num   Mappings                       │
1307                  │  2       1       2    l → l, r → l                   │
1308                  │  2       2       1    adjust balance                 │
1309                  │  4       1       4    lf → l, rf → l, lb → l, rb → l │
1310                  │  4       2       2    lf → l&rf → r, lb → l&rb → r   │
1311                  │  4       4       1    adjust balance                 │
1312                  │  4       4       2    front balance, back balance    │
1313                  └──────────────────────────────────────────────────────┘
1314
1315       noiseprof [profile-file]
1316              Calculate  a  profile  of  the audio for use in noise reduction.
1317              See the description of the noisered effect for details.
1318
1319       noisered profile-file [threshold]
1320              Noise reduction filter with profiling.  This  filter  is  moder‐
1321              ately  effective at removing consistent background noise such as
1322              hiss or hum.  To use it, first run the  noiseprof  effect  on  a
1323              section  of audio that ideally would contain silence but in fact
1324              contains noise.  The noiseprof effect will  write  out  a  noise
1325              profile  to  profile-file,  or  to  stdout if no profile-file is
1326              specified.  If there is audio output on stdout then the  profile
1327              will instead be directed to stderr.
1328
1329              To  actually  remove  the noise, run SoX again with the noisered
1330              filter.  The filter needs  one  parameter,  profile-file,  which
1331              contains  the noise profile from noiseprof.  threshold specifies
1332              how much noise should be removed, and may be  between  0  and  1
1333              with a default of 0.5.  Higher values will remove more noise but
1334              present a greater likelihood of  distorting  the  desired  audio
1335              signal.   Experiment with different threshold values to find the
1336              optimal one for your audio.
1337
1338       pad { length[@position] }
1339              Pad the audio with silence, at the beginning, the  end,  or  any
1340              specified  points  through  the audio.  Both length and position
1341              can specify a time or, if appended with an `s', a number of sam‐
1342              ples.   length  is  the amount of silence to insert and position
1343              the position in the input audio stream at which  to  insert  it.
1344              Any  number  of lengths and positions may be specified, provided
1345              that a specified position is not less  than  the  previous  one.
1346              position  is  optional  for the first and last lengths specified
1347              and, if omitted, corresponds to the beginning and the end of the
1348              audio  respectively.   For example: pad 1.5 1.5 adds 1.5 seconds
1349              of silence  padding  at  each  end  of  the  audio,  whilst  pad
1350              4000s@3:00  inserts  4000  samples of silence 3 minutes into the
1351              audio.  If silence is wanted only at the end of the audio, spec‐
1352              ify  either the end position or specify a zero-length pad at the
1353              start.
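
              The length[@position] notation can be illustrated with the
              following Python sketch, which splits each argument and checks
              the ordering rule described above.  The names to_samples and
              parse_pad are hypothetical; this is not SoX's own parser.

```python
def to_samples(spec, rate):
    """Convert a time ('3:00', '1.5') or a sample count ('4000s') to samples."""
    if spec.endswith("s"):
        return int(spec[:-1])
    seconds = 0.0
    for field in spec.split(":"):
        seconds = seconds * 60 + float(field)
    return round(seconds * rate)


def parse_pad(args, rate):
    """Return (length, position) pairs in samples; position is None if omitted.

    Raises ValueError when an explicit position is less than the previous one.
    Hypothetical helper for illustration only.
    """
    pads = []
    for arg in args:
        length, _, position = arg.partition("@")
        pads.append((to_samples(length, rate),
                     to_samples(position, rate) if position else None))
    explicit = [pos for _, pos in pads if pos is not None]
    if explicit != sorted(explicit):
        raise ValueError("positions must not decrease")
    return pads
```

              At 8000 Hz, `4000s@3:00' becomes a 4000-sample pad at sample
              1440000.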
1354
1355       pan direction
1356              Pan the audio from one channel to  another.   This  is  done  by
1357              changing  the  volume of the input channels so that it fades out
1358              on one channel and fades in on another.  If the number of  input
1359              channels differs from the number of output channels, this effect
1360              tries to handle the difference sensibly.  For instance,
1361              if  the input contains 1 channel and the output contains 2 chan‐
1362              nels, then it will  create  the  missing  channel  itself.   The
1363              direction is a value from -1 to 1.  -1 represents far left and 1
1364              represents far right.  Numbers in between  will  start  the  pan
1365              effect without totally muting the opposite channel.
1366
1367       phaser gain-in gain-out delay decay speed [-s|-t]
1368              Add  a phasing effect to the audio.  delay/decay/speed gives the
1369              delay in milliseconds and the decay (relative to gain-in) with a
1370              modulation  speed  in  Hz.   The modulation is either sinusoidal
1371              (-s) or triangular (-t).  The decay should be less than  0.5  to
1372              avoid feedback.  Gain-out is the volume of the output.
1373
1374       pitch shift [width interpolate fade]
1375              Change  the  pitch  of  the audio without affecting its duration by
1376              cross-fading shifted samples.  shift is given in cents.   Use  a
1377              positive  value  to  raise the pitch and a negative value to lower
1378              it.  The default shift is 0.  width is the window width in ms; the
1379              default is 20ms.  Try 30ms to lower the pitch, and 10ms to raise
1380              it.  The interpolate option can be cubic or linear; the default is
1381              cubic.  The fade option can be cos, hamming, linear or trapezoid;
1382              the default is cos.
1383
1384       polyphase [-w nut|ham] [-width n] [-cutoff c]
1385              Change the sampling rate using `polyphase interpolation', a  DSP
1386              algorithm.  This method is relatively slow and memory intensive.
1387
1388              If  the  -w  parameter is nut, then a Nuttall (~90 dB stop-band)
1389              window will be used; ham selects a Hamming  (~43  dB  stop-band)
1390              window.  The default is Nuttall.
1391
1392              The  -width  parameter  specifies the (approximate) width of the
1393              filter. The default is 1024 samples, which  produces  reasonable
1394              results.
1395
1396              The  -cutoff  value (c) specifies the filter cutoff frequency as
1397              a fraction of the frequency bandwidth, also known as the
1398              Nyquist frequency.  See the resample effect for further
1399              information on the Nyquist frequency.  If up-sampling, this is the
1400              fraction  of  the  original  signal  that should go through.  If
1401              down-sampling, this is the fraction of  the  signal  left  after
1402              down-sampling.  The default is 0.95.
1403
1404              See  also  rabbit  and  resample  for other sample-rate changing
1405              effects.
1406
1407       rabbit [-c0|-c1|-c2|-c3|-c4]
1408              Change the sampling rate using `libsamplerate',  also  known  as
1409              `Secret  Rabbit  Code'.   This  effect is optional and must have
1410              been selected at compile  time  of  SoX.   See  http://www.mega-
1411              nerd.com/SRC  for  details  of  the  algorithms.   Algorithms  0
1412              through 2 are progressively faster and lower quality versions of
1413              the  sinc  algorithm;  the default is -c0, which is probably the
1414              best quality algorithm for general use  currently  available  in
1415              SoX.  Algorithm 3 is zero-order hold, and 4 is linear interpola‐
1416              tion.
1417
1418              See also polyphase and resample for other  sample-rate  changing
1419              effects, and see resample for more discussion of resampling.
1420
1421       repeat count
1422              Repeat  the  entire  audio  count times.  Requires disk space to
1423              store the data to be repeated.  Note that repeating once  yields
1424              two copies: the original audio and the repeated audio.
1425
1426       resample [-qs|-q|-ql] [rolloff [beta]]
1427              Change  the  sampling  rate  using  simulated analog filtration.
1428              Other rate changing effects available are polyphase and  rabbit.
1429              There  is  a  detailed  analysis  of  resample  and polyphase at
1430              http://leute.server.de/wilde/resample.html;  see  rabbit  for  a
1431              pointer to its own documentation.
1432
1433              By  default,  linear  interpolation is used, with a window width
1434              of about 45 samples at the lower of the two rates.  This gives an
1435              accuracy  of about 16 bits, but insufficient stop-band rejection
1436              in the case that you want to have roll-off  greater  than  about
1437              0.8 of the Nyquist frequency.
1438
1439              The  -q* options will change the default values for roll-off and
1440              beta as well as use quadratic interpolation  of  filter  coeffi‐
1441              cients,  resulting  in about 24 bits precision.  The -qs, -q, or
1442              -ql options specify increased accuracy at the cost of lower exe‐
1443              cution  speed.   It  is  optional  to  specify roll-off and beta
1444              parameters when using the -q* options.
1445
1446              Following is a table of the reasonable defaults which are built-
1447              in to SoX:
1448
1449
1450                    ┌──────────────────────────────────────────────────┐
1451                    │Option   Window   Roll-off   Beta   Interpolation │
1452                    │(none)     45       0.80      16       linear     │
1453                    │ -qs       45       0.80      16      quadratic   │
1454                    │  -q       75      0.875      16      quadratic   │
1455                    │ -ql      149       0.94      16      quadratic   │
1456                    └──────────────────────────────────────────────────┘
1457              -qs,  -q,  or  -ql use window lengths of 45, 75, or 149 samples,
1458              respectively, at the lower sample-rate of the two  files.   This
1459              means  progressively sharper stop-band rejection, at proportion‐
1460              ally slower execution times.
1461
1462              rolloff refers to the cut-off frequency of the low  pass  filter
1463              and  is  given  in  terms of the Nyquist frequency for the lower
1464              sample rate.  rolloff therefore should be  something  between  0
1465              and 1, in practise 0.8-0.95.  The defaults are indicated above.
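
              As a worked illustration of the paragraph above, the following
              Python sketch turns a rolloff value into a cut-off frequency in
              Hz.  The function name rolloff_cutoff_hz is hypothetical, and
              the sketch only restates the definition given here; it does not
              reproduce the filter design itself.

```python
def rolloff_cutoff_hz(rolloff: float, input_rate: int, output_rate: int) -> float:
    """Cut-off frequency implied by `rolloff` for a given rate conversion.

    rolloff is a fraction of the Nyquist frequency (half the sample rate)
    of the lower of the two rates.  Illustration only.
    """
    nyquist_of_lower_rate = min(input_rate, output_rate) / 2.0
    return rolloff * nyquist_of_lower_rate
```

              For a conversion from 44100 Hz to 22050 Hz with the default
              rolloff of 0.80, this gives a cut-off of about 8820 Hz.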
1466
1467              The  Nyquist  frequency is equal to half the sample rate.  This
1468              is because an A/D converter needs at least 2 samples
1469              to represent 1 cycle at the Nyquist frequency.  Frequencies
1470              higher than the Nyquist frequency appear to the A/D converter as
1471              lower frequencies; this is called aliasing.  Normally, A/D
1472              converters run the signal through a lowpass filter first to avoid
1473              these problems.
1474
1475              Similar  problems will happen in software when reducing the sam‐
1476              ple rate of an audio file (frequencies  above  the  new  Nyquist
1477              frequency  can  be  aliased to lower frequencies).  Therefore, a
1478              good resample effect will remove all frequency information above
1479              the new Nyquist frequency.
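
              The aliasing described above can be made concrete with a small
              Python sketch that computes where a tone would appear after
              sampling without any filtering.  The name aliased_frequency is
              hypothetical; the sketch assumes an ideal sampler.

```python
def aliased_frequency(freq_hz: int, sample_rate: int) -> int:
    """Apparent frequency of a tone sampled at sample_rate with no filtering.

    Frequencies fold back around multiples of the sample rate so that the
    result always lies between 0 and the Nyquist frequency.
    """
    folded = freq_hz % sample_rate
    if folded > sample_rate / 2:
        # Above the Nyquist frequency: mirror back into the lower half.
        return sample_rate - folded
    return folded
```

              For example, at a sample rate of 8000 Hz a 5000 Hz tone would
              appear at 3000 Hz, which is why such content has to be removed
              before the rate is reduced.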
1480
1481              The  rolloff  refers  to how close to the Nyquist frequency this
1482              cutoff is, with closer being better.  When increasing the sample
1483              rate of an audio file you would not expect any frequencies
1484              to exist above the original Nyquist frequency.
1485              Resampling, however, commonly creates aliasing
1486              artifacts above the old Nyquist frequency.  In that case
1487              the rolloff refers to how close to the original Nyquist frequency
1488              the lowpass filter that removes these artifacts cuts off, with
1489              closer also being better.
1490
1491              The  beta  parameter  determines the type of filter window used.
1492              Any value greater than 2 is the beta for a Kaiser window.   Beta
1493              ≤  2 selects a Nuttall window.  If unspecified, the default is a
1494              Kaiser window with beta 16.
1495
1496              In the case of Kaiser window (beta > 2), lower betas  produce  a
1497              somewhat  faster  transition from pass-band to stop-band, at the
1498              cost of noticeable artifacts.  A beta of 16 is the default, beta
1499              less  than 10 is not recommended.  If you want a sharper cutoff,
1500              don't use low betas; use a longer sample window.  A Nuttall
1501              window is selected by specifying any `beta' ≤ 2, and the Nuttall
1502              window has somewhat steeper cutoff than the default Kaiser  win‐
1503              dow.   You  will  probably not need to use the beta parameter at
1504              all, unless you are just curious about comparing the effects  of
1505              Nuttall vs. Kaiser windows.
1506
1507              This  is the default effect if the two files have different sam‐
1508              pling rates.  Default parameters are, as indicated above, Kaiser
1509              window  of  length 45, roll-off 0.80, beta 16, linear interpola‐
1510              tion.
1511
1512              Note: -qs is only slightly slower, but more accurate for  16-bit
1513              or higher precision.
1514
1515              Note:  In many cases of up-sampling, no interpolation is needed,
1516              as exact filter coefficients can be  computed  in  a  reasonable
1517              amount of space.  To be precise, this is done when
1518
1519                                  input-rate < output-rate
1520                                            and
1521                      output-rate ÷ gcd(input-rate, output-rate) ≤ 511
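
              The condition above can be checked with a few lines of Python.
              The function name uses_exact_coefficients is hypothetical; the
              sketch simply restates the two inequalities.

```python
from math import gcd


def uses_exact_coefficients(input_rate: int, output_rate: int) -> bool:
    """True when the stated condition for skipping interpolation holds.

    Requires up-sampling and a reduced output rate of at most 511.
    Illustration of the inequalities only.
    """
    reduced_output = output_rate // gcd(input_rate, output_rate)
    return input_rate < output_rate and reduced_output <= 511
```

              For instance, 44100 to 48000 satisfies the condition (gcd 300,
              reduced output rate 160), whereas 11025 to 48000 does not (gcd
              75, reduced output rate 640).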
1522
1523       reverb gain-out reverb-time <delay>
1524              Add  reverberation  to  the  audio.  Each delay is given in
1525              milliseconds and its feedback depends on the reverb-time in
1526              milliseconds.  Each delay should be in the range of a quarter to
1527              a half of reverb-time to get a realistic reverberation.
1528              gain-out is the volume of the output.
1529
1530       reverse
1531              Reverse  the audio completely.  Requires disk space to store the
1532              data to be reversed.
1533
1534       silence above-periods [duration threshold[d|%] [below-periods  duration
1535       threshold[d|%]]]
1536
1537              Removes silence from the beginning, middle, or end of the audio.
1538              Silence is anything below a specified threshold.
1539
1540              The above-periods value is used to indicate if audio  should  be
1541              trimmed  at  the  beginning of the audio.  A value of zero indi‐
1542              cates no silence should be trimmed  from  the  beginning.   When
1543              specifying a non-zero above-periods, it trims audio up until it
1544              finds non-silence.  Normally, when trimming silence from  begin‐
1545              ning  of  audio  the  above-periods  will  be  1  but  it can be
1546              increased to higher values to trim all audio up  to  a  specific
1547              count  of non-silence periods.  For example, if you had an audio
1548              file with two songs that each contained  2  seconds  of  silence
1549              before the song, you could specify an above-period of 2 to strip
1550              out both silence periods and the first song.
1551
1552              When above-periods is non-zero, you must also specify a duration
1553              and  threshold.   Duration  indicates  the  amount of time that
1554              non-silence must be detected before it stops trimming audio.  By
1555              increasing  the  duration,  bursts of noise  can  be treated as
1556              silence and trimmed off.
1557
1558              Threshold is used to indicate what sample value you should treat
1559              as silence.  For digital audio, a value of 0 may be fine but for
1560              audio recorded from analog, you may wish to increase  the  value
1561              to account for background noise.
1562
1563              When  optionally trimming silence from the end of the audio, you
1564              specify a below-periods count.  In this case, below-period means
1565              to  remove  all audio after silence is detected.  Normally, this
1566              will be a value of 1, but it can be increased to skip over periods
1567              of silence that are wanted.  For example, if you have a song
1568              with 2 seconds of silence in the middle and 2 seconds at the end,
1569              you  could  set  below-period  to  a value of 2 to skip over the
1570              silence in the middle of the audio.
1571
1572              For below-periods, duration specifies a period of  silence  that
1573              must exist before audio is not copied any more.  By specifying a
1574              higher duration, silence that is  wanted  can  be  left  in  the
1575              audio.   For example, if you have a song with an expected 1 sec‐
1576              ond of silence in the middle and 2 seconds  of  silence  at  the
1577              end, a duration of 2 seconds could be used to skip over the mid‐
1578              dle silence.
1579
1580              Unfortunately, you must know the length of the  silence  at  the
1581              end  of  your  audio  file to trim off silence reliably.  A work
1582              around is to use the silence  effect  in  combination  with  the
1583              reverse  effect.   By first reversing the audio, you can use the
1584              above-periods to reliably trim all audio from  what  looks  like
1585              the  front of the file.  Then reverse the file again to get back
1586              to normal.
1587
1588              To remove silence from the middle of a file,  specify  a  below-
1589              periods that is negative.  This value is then treated as a posi‐
1590              tive value and is  also  used  to  indicate  the  effect  should
1591              restart  processing as specified by the above-periods, making it
1592              suitable for removing periods of silence in the  middle  of  the
1593              audio.
1594
1595              The  period counts are in units of samples.  Duration counts may
1596              be in the format of hh:mm:ss.frac, or the exact  count  of  sam‐
1597              ples.   Threshold numbers may be suffixed with d to indicate the
1598              value is in decibels, or % to indicate a percentage  of  maximum
1599              value of the sample value (0% specifies pure digital silence).
1600
1601       speed factor[c]
1602              Adjust  the  audio  speed (pitch and tempo together).  factor is
1603              either the ratio of the new speed to the old speed: greater than
1604              1  speeds  up,  less than 1 slows down, or, if appended with the
1605              letter `c', the number of cents (i.e. 100ths of a  semitone)  by
1606              which  the  pitch (and tempo) should be adjusted: greater than 0
1607              increases, less than 0 decreases.
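
              Since there are 1200 cents in an octave and an octave doubles
              the frequency, a value in cents corresponds to a speed ratio of
              2 raised to the power cents/1200.  The Python sketch below
              shows the conversion; the name speed_ratio is hypothetical and
              the code is not taken from SoX.

```python
def speed_ratio(factor: str) -> float:
    """Convert a speed argument ('1.5' or '100c') to a plain speed ratio.

    A trailing `c' means cents (100ths of a semitone, 1200 per octave).
    Hypothetical helper for illustration only.
    """
    if factor.endswith("c"):
        cents = float(factor[:-1])
        return 2.0 ** (cents / 1200.0)
    return float(factor)
```

              Thus `1200c' is equivalent to a factor of 2, and `-1200c' to a
              factor of 0.5.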
1608
1609              By default, the speed change is performed by the resample effect
1610              with  its default parameters.  For higher quality resampling, in
1611              addition to the speed effect, specify either the resample or the
1612              rabbit effect with appropriate parameters.
1613
1614       stat [-s n] [-rms] [-freq] [-v] [-d]
1615              Do  a  statistical check on the input file, and print results on
1616              the standard error file.  Audio is passed unmodified through the
1617              SoX processing chain.
1618
1619              The  `Volume  Adjustment:' field in the statistics gives you the
1620              parameter to the -v number which will make the audio as loud  as
1621              possible without clipping.  Note: See the discussion on Clipping
1622              above for reasons why it is rarely a good idea  to  actually  do
1623              this.
1624
1625              The  option  -v  will print out the `Volume Adjustment:' field's
1626              value only and return.  This can be useful in scripts that adjust
1627              the volume automatically.
1628
1629              The -s option is used to scale the input data by a given factor.
1630              The default value of n is the max value of a signed  long  vari‐
1631              able  (0x7fffffff).   Internal  effects  always work with signed
1632              long PCM data and so the value should relate to this fact.
1633
1634              The -rms option will convert all output average values to  `root
1635              mean square' format.
1636
1637              The  -freq  option  calculates  the  input's  power spectrum and
1638              prints it to standard error.
1639
1640              There is also an optional parameter -d that will print out a hex
1641              dump  of  the  audio  from the internal buffer that is in 32-bit
1642              signed PCM data.  This is mainly only of use  in  tracking  down
1643              endian problems that creep in to SoX on cross-platform versions.
1644
1645       stretch factor [window fade shift fading]
1646              Time-stretch the audio by the given factor, changing its duration
1647              without affecting the pitch.  A factor greater than 1 lengthens
1648              the audio; less than 1 shortens it.  window is the window size in
1649              ms; the default is 20ms.  The fade option can be `lin'.  shift is a
1650              ratio in the range [0, 1]; its default depends on the stretch factor:
1651              1 to shorten, 0.8 to lengthen.  fading is a ratio in the range
1652              [0, 0.5]; its default depends on factor and shift.
1653
1654       swap [1 2 | 1 2 3 4]
1655              Swap channels in multi-channel audio files.  Optionally, you may
1656              specify the channel order you would like the  output  in.   This
1657              defaults  to output channel 2 and then 1 for stereo and 2, 1, 4,
1658              3 for quad-channels.  An interesting feature  is  that  you  may
1659              duplicate  a given channel by overwriting another.  This is done
1660              by repeating an output channel on the command-line.   For  exam‐
1661              ple,  swap 2 2 will overwrite channel 1 with channel 2; creating
1662              a stereo file with both channels containing the same audio.
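
              The channel-order arguments can be read as a list of source
              channels, one per output channel.  The Python sketch below
              illustrates that reading; the name swap_channels and the tuple
              representation of a frame are assumptions of the sketch.

```python
def swap_channels(frame, order):
    """Build an output frame by picking input channels in the given order.

    `order` uses 1-based channel numbers, as on the command line.
    Hypothetical helper for illustration only.
    """
    return tuple(frame[number - 1] for number in order)


stereo_default = swap_channels(("L", "R"), (2, 1))   # channels exchanged
duplicated = swap_channels(("L", "R"), (2, 2))       # channel 2 in both
```

              Here stereo_default is ("R", "L") and duplicated is ("R", "R"),
              matching the swap 2 2 example above.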
1663
1664       synth [len] {[type] [combine] [freq[-freq2]] [off] [ph] [p1] [p2] [p3]}
1665              This effect can be used to generate  fixed  or  swept  frequency
1666              audio  tones  with various wave shapes, or to generate wide-band
1667              noise of various `colours'.  Multiple synth effects can be  cas‐
1668              caded  to  produce  more  complex waveforms; at each stage it is
1669              possible to choose whether the generated waveform will be  mixed
1670              with,  or  modulated  onto  the  output from the previous stage.
1671              Audio for each channel in a multi-channel audio file can be syn‐
1672              thesised independently.
1673
1674              Though this effect is used to generate audio, an input file must
1675              still be given, the characteristics of which will be used to set
1676              the  synthesised  audio  length, the number of channels, and the
1677              sampling rate; however, since the input file's audio is not nor‐
1678              mally  needed, a `null file' (with the special name -n) is often
1679              given instead (and the length specified as a parameter to  synth
1680              or by another given effect that has an associated length).
1681
1682              For example, the following produces a 3 second, 44.1 kHz, stereo
1683              audio file containing a sine-wave swept from 300 to 3300 Hz:
1684
1685                     sox -n output.au synth 3 sine 300-3300
1686
1687              and this produces an 8 kHz mono version:
1688
1689                     sox -r 8000 -c 1 -n output.au synth 3 sine 300-3300
1690
1691              Multiple channels can be synthesised by specifying  the  set  of
1692              parameters  shown  between  braces multiple times; the following
1693              puts the swept tone in the left channel and adds  `brown'  noise
1694              in the right:
1695
1696                     sox -n output.au synth 3 sine 300-3300 brownnoise
1697
1698              The  following  example  shows how two synth effects can be cas‐
1699              caded to create a more complex waveform:
1700
1701                     sox -n output.au synth 0.5 sine 200-500 synth 0.5 sine fmod 700-100
1703
1704              Frequencies  can  also be given as a number of musical semitones
1705              relative to `middle A' (440 Hz) by prefixing  a  `%'  character;
1706              for example, the following could be used to help tune a guitar's
1707              `E' strings:
1708
1709                     play -n synth sine %-17
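
              To see which frequency a `%' value corresponds to, the semitone
              offset can be converted to Hz by hand.  The sketch below assumes
              twelve-tone equal temperament (each semitone is a factor of
              2^(1/12)); that tuning, and the function name semitones_to_hz,
              are assumptions made for illustration and are not taken from
              SoX.

```python
def semitones_to_hz(semitones, reference_hz=440.0):
    """Convert a semitone offset from A (440 Hz) to a frequency in Hz.

    Assumes equal temperament: one semitone is a ratio of 2 ** (1 / 12).
    """
    return reference_hz * 2.0 ** (semitones / 12.0)


# The offset used in the example above.
print(round(semitones_to_hz(-17), 2))
```

              Under that assumption, %-17 comes out at roughly 164.8 Hz, and
              %0 is 440 Hz by definition.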
1710
1711              N.B.  This effect generates audio at maximum volume, which means
1712              that  there  is  a  high chance of clipping when using the audio
1713              subsequently, so in most cases, you will  want  to  follow  this
1714              effect  with the vol effect to prevent this from happening. (See
1715              also Clipping above.)
1716
1717              A detailed description of each synth parameter follows:
1718
1719              len is the length of audio to synthesise expressed as a time  or
1720              as a number of samples; 0=inputlength, default=0.
1721
1722              The format for specifying lengths in time is hh:mm:ss.frac.  The
1723              format for specifying sample counts is  the  number  of  samples
1724              with the letter `s' appended to it.
1725
1726              type is one of sine, square, triangle, sawtooth, trapezium, exp,
1727              [white]noise, pinknoise, brownnoise; default=sine
1728
1729              combine is one of create, mix, amod (amplitude modulation), fmod
1730              (frequency modulation); default=create
1731
1732              freq/freq2 are the frequencies at the beginning/end of synthesis
1733              in Hz  or,  if  preceded  with  `%',  semitones  relative  to  A
1734              (440 Hz);  for  both,  default=%0.   If freq2 is given, then len
1735              must also have been given.  Not used for noise.
1736
1737              off is the bias (DC-offset) of the signal in percent; default=0.
1738
1739              ph is the phase shift in percentage of 1 cycle; default=0.   Not
1740              used for noise.
1741
1742              p1  is  the  percentage  of each cycle that is `on' (square), or
1743              `rising' (triangle, exp, trapezium); default=50 (square,  trian‐
1744              gle, exp), default=10 (trapezium).
1745
1746              p2  (trapezium):  the  percentage  through  each  cycle at which
1747              `falling' begins; default=50. exp:  the  amplitude  in  percent;
1748              default=100.
1749
1750              p3  (trapezium):  the  percentage  through  each  cycle at which
1751              `falling' ends; default=60.
1752
1753       treble gain [frequency [width[s|h|o|q]]]
1754              Apply a treble tone-control effect.  See the description of  the
1755              bass effect for details.
1756
1757       tremolo speed [depth]
1758              Apply  a  tremolo (low frequency amplitude modulation) effect to
1759              the audio.  The tremolo frequency in Hz is given by  speed,  and
1760              the depth as a percentage by depth (default 40).
1761
1762              Note: This effect is a special case of the synth effect.
1763
1764       trim start [length]
1765              Trim  off  unwanted  audio  from  the  beginning and end of the
1766              audio.  Audio is not sent to the  output  stream  until  the
1767              start location is reached.
1768
1769              The  optional  length  parameter  tells the number of samples to
1770              output after the start sample and is used to trim off  the  back
1771              side  of  the audio.  Using a value of 0 for the start parameter
1772              will allow trimming off the back side only.
1773
1774              Both options can be specified using either an amount of time  or
1775              an exact count of samples.  The format for specifying lengths in
1776              time is hh:mm:ss.frac.  With a start value  of  1:30.5,  output
1777              does  not  start until 1 minute and 30.5 seconds into the audio.
1778              The format for specifying sample counts is the number of samples
1779              with  the  letter  `s' appended to it.  A value of 8000s  skips
1780              the first 8000 samples before starting to process audio.
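
              The two position formats can be related with a little
              arithmetic.  The following is a minimal sketch, not SoX's own
              parser: the function name position_to_samples is hypothetical,
              and the sampling rate must be supplied by the caller.

```python
def position_to_samples(spec, rate):
    """Convert a position such as '1:30.5' or '8000s' to a sample count.

    spec: either [[hh:]mm:]ss[.frac], or a sample count followed by 's'.
    rate: sampling rate in Hz, used only for the time format.
    """
    if spec.endswith("s"):
        # Already an exact sample count, e.g. '8000s'.
        return int(spec[:-1])
    fields = spec.split(":")
    if len(fields) > 3:
        raise ValueError("expected at most hh:mm:ss.frac")
    seconds = 0.0
    for field in fields:
        # Each field to the left is worth 60 times the next one.
        seconds = seconds * 60 + float(field)
    return round(seconds * rate)


print(position_to_samples("1:30.5", 8000))
print(position_to_samples("8000s", 44100))
```

              For instance, 1:30.5 is 90.5 seconds, which at 8000 Hz is
              724000 samples.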
1781
1782       vol gain [type [limitergain]]
1783              Apply an amplification or an attenuation to  the  audio  signal.
1784              Unlike the -v option (which is used for balancing multiple input
1785              files as they enter the SoX effects processing chain), vol is an
1786              effect  like  any  other so can be applied anywhere, and several
1787              times if necessary, during the processing chain.
1788
1789              The amount to change the volume is given by gain which is inter‐
1790              preted,  according  to  the  given  type, as follows: if type is
1791              amplitude (or is omitted), then gain is an amplitude (i.e. volt‐
1792              age  or  linear)  ratio, if power, then a power (i.e. wattage or
1793              voltage-squared) ratio, and if dB, then a power change in dB.
1794
1795              When type is amplitude or power, a gain of 1 leaves  the  volume
1796              unchanged,  less  than  1  decreases  it,  and  greater  than  1
1797              increases it; a negative gain inverts the audio signal in  addi‐
1798              tion to adjusting its volume.
1799
1800              When  type  is dB, a gain of 0 leaves the volume unchanged, less
1801              than 0 decreases it, and greater than 0 increases it.
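
              The three gain types can be compared by converting each to the
              equivalent amplitude multiplier.  This is a sketch using the
              standard definitions (power ratio = amplitude ratio squared,
              dB = 20 log10 of the amplitude ratio); the function name
              amplitude_multiplier is hypothetical, and keeping the sign of a
              negative power gain is an assumption based on the description
              above rather than on the SoX source.

```python
import math


def amplitude_multiplier(gain, gain_type="amplitude"):
    """Return the linear factor each sample would be multiplied by."""
    if gain_type == "amplitude":
        return float(gain)
    if gain_type == "power":
        # Power is proportional to amplitude squared.  The sign is kept
        # so that a negative gain still inverts the signal (assumption).
        return math.copysign(math.sqrt(abs(gain)), gain)
    if gain_type == "dB":
        # A power change of x dB is an amplitude ratio of 10 ** (x / 20).
        return 10.0 ** (gain / 20.0)
    raise ValueError("gain_type must be 'amplitude', 'power' or 'dB'")


for gain, gain_type in [(0.5, "amplitude"), (4, "power"), (0, "dB"), (-20, "dB")]:
    print(gain, gain_type, amplitude_multiplier(gain, gain_type))
```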
1802
1803              See [4] for a detailed discussion on electrical (and hence audio
1804              signal) voltage and power ratios.
1805
1806              Beware of Clipping when increasing the volume.
1807
1808              An  optional  limitergain value can be specified and should be a
1809              value much less than 1 (e.g. 0.05 or 0.02) and is used  only  on
1810              peaks  to  prevent clipping.  Not specifying this parameter will
1811              cause no limiter to be used.  In verbose mode, this effect  will
1812              display the percentage of the audio that needed to be limited.
1813
1814   Deprecated Effects
1815       The  following  effects  have  been renamed or have their functionality
1816       included in another effect.  They continue to work in this  version  of
1817       SoX but may be removed in future.
1818
1819       avg [ -l|-r|-f|-b|-1|-2|-3|-4|n{,n} ]
1820              Reduce the number of audio channels by mixing or selecting chan‐
1821              nels, or duplicate channels to increase the number of  channels.
1822              This effect is just an alias of the mixer effect and is retained
1823              for backwards compatibility only.
1824
1825       highp frequency
1826              Apply a high-pass filter.  This effect is just an alias for  the
1827              highpass  effect  used  with  its  -1 option; it is retained for
1828              backwards compatibility only.
1829
1830       lowp frequency
1831              Apply a low-pass filter.  This effect is just an alias  for  the
1832              lowpass effect used with its -1 option; it is retained for back‐
1833              wards compatibility only.
1834
1835       mask [depth]
1836              This effect is just a deprecated alias for  the  dither  effect,
1837              left for historical reasons.
1838
1839       pick [ -l|-r|-f|-b|-1|-2|-3|-4|n{,n} ]
1840              Pick  a  subset  of  channels to be copied into the output file.
1841              This effect is just an alias of the mixer effect and is retained
1842              for backwards compatibility only.
1843
1844       rate   Does  the  same  as  resample  with no parameters; it exists for
1845              backwards compatibility.
1846
1847       vibro speed [depth]
1848              This is a deprecated alias for the tremolo effect.   It  differs
1849              in  that  the depth parameter ranges from 0 to 1 and defaults to
1850              0.5.
1851

DIAGNOSTICS

1853       Exit status is 0 for no error, 1 if there is a problem  with  the  com‐
1854       mand-line parameters, or 2 if an error occurs during file processing.
1855

BUGS

1857       Please report any bugs found in this version of SoX to the mailing list
1858       (sox-users@lists.sourceforge.net).
1859

LICENSE

1877       Copyright  1991  Lance  Norskog  and  Sundry  Contributors.   Copyright
1878       1998-2007 by Chris Bagwell and SoX Contributors.
1879
1880       This program is free software; you can redistribute it and/or modify it
1881       under the terms of the GNU General Public License as published  by  the
1882       Free  Software  Foundation;  either  version 2, or (at your option) any
1883       later version.
1884
1885       This program is distributed in the hope that it  will  be  useful,  but
1886       WITHOUT  ANY  WARRANTY;  without  even  the  implied  warranty  of MER‐
1887       CHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU  General
1888       Public License for more details.
1889

AUTHORS

1891       Chris Bagwell (cbagwell@users.sourceforge.net).  Other authors and con‐
1892       tributors are listed in the AUTHORS file that is distributed  with  the
1893       source code.
1894
1895
1896
1897sox                            January 31, 2007                         SoX(1)