SoX(7)                          Sound eXchange                          SoX(7)

NAME

6       SoX - Sound eXchange, the Swiss Army knife of audio manipulation
7

DESCRIPTION

9       This  manual  describes  SoX  supported  file  formats and audio device
10       types; the SoX manual set starts with sox(1).
11
       Format types that SoX can determine by a filename extension are
       listed with their names preceded by a dot.  Format types that are
       optionally built into SoX are marked `(optional)'.
15
16       Format types that can be handled by an external library via an optional
17       pseudo  file  type (currently sndfile or ffmpeg) are marked e.g. `(also
18       with -t sndfile)'.  This might be  useful  if  you  have  a  file  that
19       doesn't work with SoX's default format readers and writers, and there's
20       an external reader or writer for that format.
21
22       To see if SoX has support for an optional format or device,  enter  sox
23       -h and look for its name under the list: `AUDIO FILE FORMATS' or `AUDIO
24       DEVICE DRIVERS'.
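
       The check described above can be scripted.  The following is a
       minimal sketch, assuming that sox -h prints each list on a single
       line that begins with the heading followed by a colon; the helper
       name sox_has is hypothetical and not part of SoX.

```shell
# Hypothetical helper (not part of SoX): succeeds if NAME appears in the
# list that `sox -h` prints after HEADING.
# Assumption: the list is on one line of the form "HEADING: name name ...".
sox_has() {
    heading=$1
    name=$2
    sox -h 2>&1 | grep "^$heading:" | tr ' ' '\n' | grep -qx -- "$name"
}

if sox_has 'AUDIO FILE FORMATS' flac; then
    echo "flac: supported"
else
    echo "flac: not listed"
fi
```

       If the local build wraps the list over several lines, the grep
       pattern would need adjusting.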
25
   SOX FORMATS & DEVICE DRIVERS
27       .raw (also with -t sndfile),
28       .f4, .f8,
29       .s1, .s2, .s3, .s4,
30       .u1, .u2, .u3, .u4,
31       .ul, .al, .lu, .la,
32       .sb, .sw, .ub, .uw
33              Raw (headerless) audio files.  For raw, the sample rate and  the
34              data  encoding  must be given using command-line format options;
35              for the other listed types, the sample  rate  defaults  to  8kHz
36              (but may be overridden), and the data encoding is defined by the
37              given suffix.  Thus f4 and f8 indicate files encoded  as  4  and
38              8-byte  (IEEE  single  and  double precision) floating point PCM
39              respectively; s1, s2, s3, and s4 indicate 1, 2,  3,  and  4-byte
40              signed  integer PCM respectively; u1, u2, u3, and u4 indicate 1,
41              2, 3, and 4-byte unsigned integer PCM respectively; ul indicates
42              `μ-law'  (byte),  al indicates `A-law' (byte), and lu and la are
43              inverse bit order `μ-law' and inverse bit order `A-law'  respec‐
44              tively.   sb, sw, ub, uw, and sl are aliases for s1, s2, u1, u2,
45              and s4 respectively.  For all raw formats, the number  of  chan‐
46              nels defaults to 1 (but may be overridden).
47
48              Headerless  audio  files on a SPARC computer are likely to be of
49              format ul;  on a Mac, they're likely to be u1 but with a  sample
50              rate of 11025 or 22050 Hz.
51
52              See .ima and .vox for raw ADPCM formats.
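
              The suffix rules above can be summarised as a lookup.  This is
              only an illustrative sketch; raw_suffix_info is a hypothetical
              shell function, not a SoX command, and it simply restates the
              mapping given in this entry.

```shell
# Hypothetical helper (not part of SoX): describe the encoding implied by a
# raw-format suffix, following the rules given in this entry.
raw_suffix_info() {
    case $1 in
        f4) echo "4-byte floating point PCM" ;;
        f8) echo "8-byte floating point PCM" ;;
        s[1-4]) echo "${1#s}-byte signed integer PCM" ;;
        u[1-4]) echo "${1#u}-byte unsigned integer PCM" ;;
        sb) raw_suffix_info s1 ;;      # aliases resolve to the base suffix
        sw) raw_suffix_info s2 ;;
        sl) raw_suffix_info s4 ;;
        ub) raw_suffix_info u1 ;;
        uw) raw_suffix_info u2 ;;
        ul) echo "mu-law (byte)" ;;
        al) echo "A-law (byte)" ;;
        lu) echo "inverse bit order mu-law" ;;
        la) echo "inverse bit order A-law" ;;
        *)  echo "not a raw suffix: $1" >&2; return 1 ;;
    esac
}

raw_suffix_info sw    # prints: 2-byte signed integer PCM
```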
53
54       .8svx (also with -t sndfile)
55              Amiga 8SVX musical instrument description format.
56
57       .aiff, .aif (also with -t sndfile)
58              AIFF  files  used  on Apple Macs as well as older Apple IIc/IIgs
59              and SGI.  Currently, SoX's AIFF support does not include  multi‐
60              ple  audio  chunks,  or  the 8SVX musical instrument description
61              format.  AIFF files are multimedia archives and can have  multi‐
62              ple  audio and picture chunks.  You may need a separate archiver
63              to work with them.
64
65       .aiffc, .aifc (also with -t sndfile)
              AIFF-C is a format based on AIFF that was created to allow
              handling compressed audio.  It can also handle little-endian
              uncompressed linear data, often referred to as sowt encoding.
              This encoding has also become the de facto format produced by
              modern Macs as well as iTunes on any platform.  AIFF-C files
              produced by other applications typically have the file extension
              .aif, so their headers must be examined to detect the true
              format.  The sowt encoding is the only encoding that SoX can
              handle with this format.
75
              AIFF-C is defined in DAVIC 1.4 Part 9 Annex B.  This format is
              referenced by ARIB STD-B24, which is specified for Japanese data
              broadcasting.  Private chunks are not supported.
79
80       alsa (optional)
81              Advanced Linux Sound Architecture device driver;  supports  both
82              playing  and  recording audio.  ALSA is only used in Linux-based
83              operating systems, though these often support OSS (see below) as
84              well.  Examples:
85                   sox infile -t alsa
86                   sox infile -t alsa default
87                   sox infile -t alsa hw:0
88                   sox -2 -t alsa hw:1 outfile
89              See also play(1) and rec(1).
90
91       .amb   Ambisonic  B-Format: a specialisation of .wav with between 3 and
92              16 channels of audio for use with  an  Ambisonic  decoder.   See
93              http://www.ambisonia.com/Members/mleese/file-format-for-b-format
94              for details.  It is up to the user to get the channels  together
95              in the right order and at the correct amplitude.
96
97       .amr-nb (optional)
98              Adaptive  Multi  Rate - Narrow Band speech codec; a lossy format
99              used in 3rd generation mobile telephony and defined in  3GPP  TS
100              26.071 et al.
101
              AMR-NB audio has a fixed sampling rate of 8 kHz and supports
              encoding to the following bit-rates (as selected by the -C
              option): 0 = 4.75 kbit/s, 1 = 5.15 kbit/s, 2 = 5.9 kbit/s, 3 =
              6.7 kbit/s, 4 = 7.4 kbit/s, 5 = 7.95 kbit/s, 6 = 10.2 kbit/s, 7 =
              12.2 kbit/s.
107
108       .amr-wb (optional)
109              Adaptive  Multi  Rate  -  Wide Band speech codec; a lossy format
110              used in 3rd generation mobile telephony and defined in  3GPP  TS
111              26.171 et al.
112
              AMR-WB audio has a fixed sampling rate of 16 kHz and supports
              encoding to the following bit-rates (as selected by the -C
              option): 0 = 6.6 kbit/s, 1 = 8.85 kbit/s, 2 = 12.65 kbit/s, 3 =
              14.25 kbit/s, 4 = 15.85 kbit/s, 5 = 18.25 kbit/s, 6 = 19.85
              kbit/s, 7 = 23.05 kbit/s, 8 = 23.85 kbit/s.
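
              The -C tables for .amr-nb and .amr-wb can be expressed as a
              small lookup, for instance to label output files with their
              bit-rate.  This is a sketch only; amr_bitrate is a hypothetical
              shell function, not part of SoX, and the values are copied from
              the two entries above.

```shell
# Hypothetical helper (not part of SoX): print the bit-rate in kbit/s that a
# given -C value selects, using the tables in the .amr-nb and .amr-wb entries.
amr_bitrate() {
    codec=$1
    mode=$2
    case $codec in
        nb) rates="4.75 5.15 5.9 6.7 7.4 7.95 10.2 12.2" ;;
        wb) rates="6.6 8.85 12.65 14.25 15.85 18.25 19.85 23.05 23.85" ;;
        *)  echo "unknown codec: $codec" >&2; return 1 ;;
    esac
    case $mode in
        ''|*[!0-9]*) echo "mode must be a whole number" >&2; return 1 ;;
    esac
    # Word-split the table into positional parameters, then drop the first
    # $mode entries so that the wanted rate becomes $1.
    set -- $rates
    [ "$mode" -lt "$#" ] || { echo "mode out of range: $mode" >&2; return 1; }
    shift "$mode"
    echo "$1"
}

amr_bitrate nb 7    # prints: 12.2
amr_bitrate wb 2    # prints: 12.65
```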
118
119       ao (optional)
120              Xiph.org's  Audio  Output  device driver; works only for playing
121              audio.  It supports a wide range of devices and sound systems  -
122              see  its  documentation  for the full range.  For the most part,
123              SoX's use of libao cannot be configured directly; instead, libao
124              configuration files must be used.
125
              The filename specified is used to determine which libao plugin
              to use.  Normally, you should specify `default' as the filename.
              If that doesn't give the desired behavior, you can specify
              the short name of a given plugin (such as pulse for the
              PulseAudio plugin).  Examples:
131                   sox infile -t ao
132                   sox infile -t ao default
133                   sox infile -t ao pulse
134              See also play(1).
135
136       .au, .snd (also with -t sndfile)
137              Sun Microsystems AU files.  There are many types of AU file; DEC
138              has invented its own with a  different  magic  number  and  byte
139              order.   To  write a DEC file, use the -L option with the output
140              file options.
141
142              Some .au files are known to have invalid AU headers;  these  are
143              probably  original Sun μ-law 8000 Hz files and can be dealt with
144              using the .ul format (see below).
145
146              It is possible to override AU file header information  with  the
147              -r  and  -c  options,  in which case SoX will issue a warning to
148              that effect.
149
150       .avr   Audio Visual Research format; used by  a  number  of  commercial
151              packages on the Mac.
152
153       .caf (optional)
154              Apple's Core Audio File format.
155
156       .cdda, .cdr
              `Red Book' Compact Disc Digital Audio.  CDDA has two audio
              channels formatted as 16-bit signed integers at a sample rate of
              44.1 kHz.  The number of (stereo) samples in each CDDA track is
              always a multiple of 588, which is why it needs its own handler.
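
              The figure of 588 is 44100 / 75, i.e. the number of stereo
              samples in one 1/75-second CD sector.  As a worked example of
              the arithmetic (not a description of how SoX's handler is
              implemented), the hypothetical function below computes how many
              stereo samples would be needed to round a track length up to
              the next multiple of 588.

```shell
# Hypothetical helper (not part of SoX): number of stereo samples that would
# have to be appended to reach the next multiple of 588.
cdda_padding() {
    echo $(( (588 - $1 % 588) % 588 ))
}

cdda_padding 44100     # prints: 0    (44100 = 75 * 588)
cdda_padding 100000    # prints: 548  (100000 = 170 * 588 + 40)
```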
161
162       coreaudio (optional)
163              Mac OSX CoreAudio  device  driver:  supports  both  playing  and
164              recording audio.  Examples:
165                   sox infile -t coreaudio
166                   sox infile -t coreaudio default
167              See also play(1) and rec(1).
168
169       .cvsd, .cvs
170              Continuously Variable Slope Delta modulation.  A headerless for‐
171              mat used to compress speech audio for applications such as voice
172              mail.  This format is sometimes used with bit-reversed samples -
173              the -X format option can be used to set the bit-order.
174
175       .cvu   Continuously Variable Slope Delta modulation (unfiltered).  This
176              is an alternative handler for CVSD that is unfiltered but can be
177              used with any bit-rate.  E.g.
178                   sox infile outfile.cvu rate 28k
179                   play -r 28k outfile.cvu filter -3.4k
180
181       .dat   Text Data files.  These files contain a  textual  representation
182              of  the  sample  data.   There is one line at the beginning that
183              contains the sample rate.  Subsequent lines contain two  numeric
184              data items: the time since the beginning of the first sample and
185              the sample value.  Values are normalized so that the maximum and
186              minimum  are  1  and -1.  This file format can be used to create
187              data files for external programs such as FFT analysers or  graph
188              routines.   SoX can also convert a file in this format back into
189              one of the other file formats.
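
              Because the format is plain text, it can be processed with
              standard tools.  The sketch below finds the peak absolute
              sample value in a single-channel file.  It assumes that header
              lines begin with `;' and that each data line holds a time and a
              value, as described above; dat_peak is a hypothetical helper,
              and the sample file is made up for illustration.

```shell
# Hypothetical helper (not part of SoX): print the largest absolute sample
# value found in a single-channel .dat file.
# Assumption: header lines start with ';' and data lines hold "time value".
dat_peak() {
    awk '
        $1 ~ /^;/ { next }             # skip header lines
        NF >= 2 {
            v = $2 + 0                 # second column is the sample value
            if (v < 0) v = -v
            if (v > peak) peak = v
        }
        END { print peak + 0 }
    ' "$1"
}

# Made-up three-sample file at 8000 Hz, used only to exercise the helper.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
; Sample Rate 8000
0 0.25
0.000125 -0.5
0.00025 0.125
EOF
dat_peak "$tmp"    # prints: 0.5
rm -f "$tmp"
```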
190
191       .dvms, .vms
192              Used in Germany to compress speech  audio  for  voice  mail.   A
193              self-describing variant of cvsd.
194
195       .fap (optional)
196              See .paf.
197
198       ffmpeg (optional)
199              This  is a pseudo-type that forces ffmpeg to be used. The actual
200              file type is deduced from the file name (it cannot  be  used  on
201              stdio).   It  can  read  a wide range of audio files, not all of
202              which are documented here, and also  the  audio  track  of  many
203              video  files  (including AVI, WMV and MPEG). At present only the
204              first audio track of a file can be read.
205
206       .flac (optional; also with -t sndfile)
207              Xiph.org's Free Lossless Audio CODEC compressed audio.  FLAC  is
208              an  open,  patent-free CODEC designed for compressing music.  It
209              is similar to MP3 and Ogg Vorbis,  but  lossless,  meaning  that
210              audio is compressed in FLAC without any loss in quality.
211
212              SoX  can  read  native FLAC files (.flac) but not Ogg FLAC files
213              (.ogg).  [But see .ogg below for information relating to support
214              for Ogg Vorbis files.]
215
216              SoX  can write native FLAC files according to a given or default
217              compression level.  8 is the default compression level and gives
218              the  best  (but  slowest)  compression;  0  gives the least (but
219              fastest) compression.  The compression level is  selected  using
220              the -C option [see sox(1)] with a whole number from 0 to 8.
221
222       .fssd  An alias for the .u1 format.
223
224       .gsm (optional; also with -t sndfile)
              GSM 06.10 Lossy Speech Compression.  A lossy format for
              compressing speech which is used in the Global System for Mobile
              Communications (GSM).  It's good for its purpose, shrinking
              audio data size, but it will introduce lots of noise when a
              given audio signal is encoded and decoded multiple times.  This
              format is used by some voice mail applications.  It is rather
              CPU intensive.
232
233       .hcom  Macintosh  HCOM  files.   These  are Mac FSSD files with Huffman
234              compression.
235
236       .htk   Single channel 16-bit PCM format used  by  HTK,  a  toolkit  for
237              building Hidden Markov Model speech processing tools.
238
239       .ircam (also with -t sndfile)
240              Another name for .sf.
241
242       .ima (also with -t sndfile)
243              A  headerless  file  of  IMA  ADPCM audio data. IMA ADPCM claims
244              16-bit precision packed into only 4 bits, but in fact sounds  no
245              better than .vox.
246
247       .lpc, .lpc10
248              LPC-10  is  a  compression  scheme  for  speech developed in the
249              United  States.   See   http://www.arl.wustl.edu/~jaf/lpc/   for
250              details.  There is no associated file format, so SoX's implemen‐
251              tation is headerless.
252
253       .mat, .mat4, .mat5 (optional)
254              Matlab 4.2/5.0 (respectively GNU Octave 2.0/2.1) format (.mat is
255              the same as .mat4).
256
257       .m3u   A  playlist  format;  contains  a  list of audio files.  SoX can
258              read, but not write this file format.  See [1]  for  details  of
259              this format.
260
261       .maud  An  IFF-conforming audio file type, registered by MS MacroSystem
262              Computer GmbH, published along with the `Toccata' sound-card  on
263              the  Amiga.   Allows  8bit linear, 16bit linear, A-Law, μ-law in
264              mono and stereo.
265
266       .mp3, .mp2 (optional read, optional write)
267              MP3 compressed audio; MP3 (MPEG  Layer  3)  is  a  part  of  the
268              patent-encumbered  MPEG  standards  for audio and video compres‐
269              sion.  It is a lossy compression format that achieves good  com‐
270              pression rates with little quality loss.
271
272              Because MP3 is patented, SoX cannot be distributed with MP3 sup‐
273              port without incurring the  patent  holder's  fees.   Users  who
274              require  SoX  with  MP3 support must currently compile and build
275              SoX with the MP3 libraries (LAME & MAD) from source code.
276
277              See also Ogg Vorbis for a similar format.
278
279       .mp4, .m4a (optional)
              MP4 compressed audio.  MP4 (MPEG 4) is part of the MPEG
              standards for audio and video compression.  See .mp3 for more
              information.
283
284       .nist (also with -t sndfile)
285              See .sph.
286
287       .ogg, .vorbis (optional)
288              Xiph.org's Ogg Vorbis compressed  audio;  an  open,  patent-free
289              CODEC  designed  for  music  and streaming audio.  It is a lossy
290              compression format (similar to MP3, VQF  &  AAC)  that  achieves
291              good compression rates with a minimum amount of quality loss.
292
              SoX can decode all types of Ogg Vorbis files, and can encode at
              different compression levels/qualities given as a number from -1
              (highest compression, lowest quality) to 10 (lowest compression,
              highest quality).  By default the encoding quality level is 3
              (which gives an encoded rate of approx. 112 kbit/s), but this can
              be changed using the -C option [see sox(1)] with a number from -1
              to 10; fractional numbers (e.g. 3.6) are also allowed.  Decoding
              is somewhat CPU intensive and encoding is very CPU intensive.
302
303              See also .mp3 for a similar format.
304
305       oss (optional)
306              Open  Sound System /dev/dsp device driver; supports both playing
307              and recording audio.  OSS  support  is  available  in  Unix-like
308              operating  systems,  sometimes  together  with alternative sound
309              systems (such as ALSA).  Examples:
310                   sox infile -t oss
311                   sox infile -t oss /dev/dsp
312                   sox -2 -t oss /dev/dsp outfile
313              See also play(1) and rec(1).
314
315       .paf, .fap (optional)
316              Ensoniq PARIS file format (big and little-endian respectively).
317
318       .pls   A playlist format; contains a list  of  audio  files.   SoX  can
319              read,  but  not  write this file format.  See [2] for details of
320              this format.
321
322              Note: SoX support for SHOUTcast PLS relies  on  wget(1)  and  is
323              only  partially  supported:  it's necessary to specify the audio
324              type manually, e.g.
325                   play -t mp3 "http://a.server/pls?rn=265&file=filename.pls"
326              and SoX does not know about alternative  servers  -  hit  Ctrl-C
327              twice in quick succession to quit.
328
329       .prc   Psion  Record. Used in Psion EPOC PDAs (Series 5, Revo and simi‐
330              lar) for System alarms  and  recordings  made  by  the  built-in
331              Record  application.  When writing, SoX defaults to A-law, which
332              is recommended; if you must use ADPCM, then use the  -i  switch.
333              The  sound  quality is poor because Psion Record seems to insist
334              on frames of 800 samples or fewer, so that the ADPCM  CODEC  has
335              to  be  reset  at  every  800  frames, which causes the sound to
336              glitch every tenth of a second.
337
338       .pvf (optional)
339              Portable Voice Format.
340
341       .sd2 (optional)
342              Sound Designer 2 format.
343
344       .sds (optional)
345              MIDI Sample Dump Standard.
346
347       .sf (also with -t sndfile)
348              IRCAM  SDIF  (Institut  de  Recherche  et  Coordination   Acous‐
349              tique/Musique  Sound  Description  Interchange  Format). Used by
350              academic music software such as  the  CSound  package,  and  the
351              MixView sound sample editor.
352
353       .sph, .nist (also with -t sndfile)
354              SPHERE  (SPeech  HEader  Resources)  is a file format defined by
355              NIST (National Institute of Standards  and  Technology)  and  is
              used with speech audio.  SoX can read these files when they
              contain μ-law and PCM data.  It will ignore any header
              information that says the data is compressed using shorten
              compression and will treat the data as either μ-law or PCM.
              This allows SoX and the command-line shorten program to be run
              together using pipes to uncompress the data and then pass the
              result to SoX for processing.
363
364       .smp   Turtle Beach SampleVision files.  SMP files are for use with the
365              PC-DOS package SampleVision by  Turtle  Beach  Softworks.   This
366              package is for communication to several MIDI samplers.  All sam‐
367              ple rates are supported by the package,  although  not  all  are
368              supported by the samplers themselves.  Currently loop points are
369              ignored.
370
371       .snd   See .au, .sndr and .sndt.
372
373       sndfile (optional)
374              This is a pseudo-type that forces libsndfile  to  be  used.  For
375              writing  files, the actual file type is then taken from the out‐
376              put file name; for reading them, it is deduced from the file.
377
378       .sndr  Sounder files.  An MS-DOS/Windows format from  the  early  '90s.
379              Sounder files usually have the extension `.SND'.
380
381       .sndt  SoundTool  files.  An MS-DOS/Windows format from the early '90s.
382              SoundTool files usually have the extension `.SND'.
383
384       .sou   An alias for the .u1 raw format.
385
386       .sox   SoX's native uncompressed PCM format, intended for  storing  (or
387              piping)  audio  at  intermediate processing points (i.e. between
388              SoX invocations).  It has much in common with the  popular  WAV,
389              AIFF,  and  AU  uncompressed  PCM formats, but has the following
390              specific characteristics: the PCM samples are always  stored  as
391              32  bit  signed integers, the samples are stored (by default) as
392              `native endian', and the  number  of  samples  in  the  file  is
393              recorded as a 64-bit integer.  Comments are also supported.
394
395              See `Special Filenames' in sox(1) for examples of using the .sox
396              format with `pipes'.
397
398       sunau (optional)
399              Sun /dev/audio device driver; supports both playing and  record‐
400              ing audio.  For example:
401                   sox infile -t sunau /dev/audio
402              or
403                   sox infile -t sunau -U -c 1 /dev/audio
404              for older sun equipment.
405
406              See also play(1) and rec(1).
407
408       .txw   Yamaha  TX-16W  sampler.   A  file format from a Yamaha sampling
              keyboard which wrote IBM-PC format 3.5" floppies.  Handles
              reading of files which do not have the sample rate field set to
              one of the expected values by looking at some other bytes in the
              attack/loop length fields, and defaulting to 33 kHz if the
              sample rate is still unknown.
414
415       .vms   See .dvms.
416
417       .voc (also with -t sndfile)
418              Sound Blaster VOC files.  VOC files are multi-part  and  contain
419              silence parts, looping, and different sample rates for different
420              chunks.  On input, the silence parts are filled out,  loops  are
421              rejected,  and  sample  data with a new sample rate is rejected.
422              Silence with a different sample rate is generated appropriately.
423              On  output,  silence  is not detected, nor are impossible sample
424              rates.  SoX supports reading (but not writing)  VOC  files  with
425              multiple   blocks,   and  files  containing  μ-law,  A-law,  and
426              2/3/4-bit ADPCM samples.
427
428       .vorbis
429              See .ogg.
430
431       .vox (also with -t sndfile)
              A headerless file of Dialogic/OKI ADPCM audio data, which
              commonly comes with the extension .vox.  This ADPCM data has
              12-bit precision packed into only 4 bits.
435
436              Note: some early Dialogic hardware does  not  always  reset  the
437              ADPCM encoder at the start of each vox file.  This can result in
438              clipping and/or DC offset problems when it comes to decoding the
439              audio.   Whilst little can be done about the clipping, a DC off‐
440              set can be removed by passing the decoded audio through a  high-
441              pass filter, e.g.:
442                   sox input.vox output.au highpass 10
443
444       .w64 (optional)
445              Sonic Foundry's 64-bit RIFF/WAV format.
446
447       .wav (also with -t sndfile)
448              Microsoft .WAV RIFF files.  This is the native audio file format
449              of Windows, and widely used for uncompressed audio.
450
              Normally .wav files have all formatting information in their
              headers, and so do not need any format options specified for an
              input file.  If any are given, they will override the file
              header, and you will be warned to this effect.  You had better
              know what you are doing!  Output format options will cause a
              format conversion, and the .wav will be written appropriately.
457
458              SoX  can read and write PCM, μ-law, A-law, MS ADPCM, and IMA (or
459              DVI) ADPCM.  Big endian versions of RIFF files, called RIFX, are
460              also  supported.   To  write a RIFX file, use the -B option with
461              the output file options.
462
463       .wavpcm
              A non-standard, but widely used, variant of .wav.  Some
              applications cannot read a standard WAV file header for
              PCM-encoded data with a sample size greater than 16 bits or with
              more than two channels, but can read a non-standard WAV header.
              It is likely that such applications will eventually be updated
              to support the standard header, but in the meantime, this SoX
              format can be used to create files with the non-standard header
              that should work with these applications.  (Note that SoX will
              automatically detect and read WAV files with the non-standard
              header.)
473
474              The most common use of this file-type is likely to be along  the
475              following lines:
476                   sox infile.any -t wavpcm -s outfile.wav
477
478       .wv (optional)
479              WavPack  lossless audio compression.  Note that, when converting
480              .wav to this format and back again, the RIFF header is not  nec‐
481              essarily preserved losslessly (though the audio is).
482
483       .wve (also with -t sndfile)
484              Psion  8-bit A-law.  Used on Psion SIBO PDAs (Series 3 and simi‐
485              lar).  This format is deprecated in SoX, but will continue to be
486              used in libsndfile.
487
488       .xa    Maxis  XA  files.   These  are  16-bit ADPCM audio files used by
489              Maxis games.  Writing .xa  files  is  currently  not  supported,
490              although adding write support should not be very difficult.
491
492       .xi (optional)
493              Fasttracker 2 Extended Instrument format.
494

SEE ALSO

496       sox(1), soxi(1), libsox(3), octave(1), wget(1)
497
498       The SoX web page at http://sox.sourceforge.net
499       SoX scripting examples at http://sox.sourceforge.net/Docs/Scripts
500
   References
502       [1]    Wikipedia, M3U, http://en.wikipedia.org/wiki/M3U
503
504       [2]    Wikipedia, PLS, http://en.wikipedia.org/wiki/PLS_(file_format)
505

AUTHORS

507       Chris Bagwell (cbagwell@users.sourceforge.net).  Other authors and con‐
508       tributors are listed in the AUTHORS file that is distributed  with  the
509       source code.

soxformat                      October 28, 2008                         SoX(7)