POCKETSPHINX_CONTINUOUS(1)  General Commands Manual  POCKETSPHINX_CONTINUOUS(1)

NAME

       pocketsphinx_continuous - Run speech recognition in continuous
       listening mode

SYNOPSIS

       pocketsphinx_continuous [-infile filename.wav] [-inmic yes]
       [options]...

DESCRIPTION

       This program opens the audio device or a file and waits for speech.
       When it detects an utterance, it performs speech recognition on it.

       To record from a microphone and decode, use

       -inmic yes

       To decode a 16 kHz, 16-bit mono WAV file, use

       -infile filename.wav

       You can also specify -lm, -fsg, or -kws, depending on whether you are
       using a statistical language model, using a finite-state grammar, or
       looking for a keyphrase.

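       The invocations described above can be sketched in Python. This is a
       minimal illustration, not part of pocketsphinx: the helper names
       is_16khz_16bit_mono and build_command are hypothetical, and only the
       -infile, -inmic, -lm, -fsg, and -kws options documented on this page
       are used. The first helper checks that a WAV file has the format that
       -infile expects; the second assembles an argument list without
       running anything.

```python
import wave
from typing import List, Optional


def is_16khz_16bit_mono(path: str) -> bool:
    """Return True if the WAV file at `path` is 16 kHz, 16-bit, mono."""
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == 16000
            and wav.getsampwidth() == 2  # width is in bytes: 2 bytes = 16 bits
            and wav.getnchannels() == 1
        )


def build_command(infile: Optional[str] = None,
                  lm: Optional[str] = None,
                  fsg: Optional[str] = None,
                  kws: Optional[str] = None) -> List[str]:
    """Assemble an argv list for pocketsphinx_continuous (hypothetical helper).

    With no `infile`, the microphone is used (-inmic yes). Each of -lm,
    -fsg, and -kws is appended only when a value is given.
    """
    cmd = ["pocketsphinx_continuous"]
    if infile is not None:
        cmd += ["-infile", infile]
    else:
        cmd += ["-inmic", "yes"]
    for flag, value in (("-lm", lm), ("-fsg", fsg), ("-kws", kws)):
        if value is not None:
            cmd += [flag, value]
    return cmd
```

       The resulting list can be passed to subprocess.run() on a system
       where pocketsphinx_continuous is installed, for example
       build_command(infile="filename.wav") after is_16khz_16bit_mono
       has confirmed the file format.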

OPTIONS

       -adcdev
              Name of audio device to use for input.

       -agc   Automatic gain control for c0 ('max', 'emax', 'noise', or
              'none')

       -agcthresh
              Initial threshold for automatic gain control

       -allphone
              Perform phoneme decoding with phonetic lm

       -allphone_ci
              Perform phoneme decoding with phonetic lm and
              context-independent units only

       -alpha Preemphasis parameter

       -argfile
              Argument file giving extra arguments.

       -ascale
              Inverse of acoustic model scale for confidence score calculation

       -aw    Inverse weight applied to acoustic scores.

       -backtrace
              Print results and backtraces to log file.

       -beam  Beam width applied to every frame in Viterbi search (smaller
              values mean wider beam)

       -bestpath
              Run bestpath (Dijkstra) search over word lattice (3rd pass)

       -bestpathlw
              Language model probability weight for bestpath search

       -ceplen
              Number of components in the input feature vector

       -cmn   Cepstral mean normalization scheme ('current', 'prior', or
              'none')

       -cmninit
              Initial values (comma-separated) for cepstral mean when 'prior'
              is used

       -compallsen
              Compute all senone scores in every frame (can be faster when
              there are many senones)

       -debug Verbosity level for debugging messages

       -dict  Main pronunciation dictionary (lexicon) input file

       -dictcase
              Dictionary is case sensitive (NOTE: case insensitivity applies
              to ASCII characters only)

       -dither
              Add 1/2-bit noise

       -doublebw
              Use double bandwidth filters (same center freq)

       -ds    Frame GMM computation downsampling ratio

       -fdict Noise word pronunciation dictionary input file

       -feat  Feature stream type, depends on the acoustic model

       -featparams
              File containing feature extraction parameters.

       -fillprob
              Filler word transition probability

       -frate Frame rate

       -fsg   Sphinx format finite state grammar file

       -fsgusealtpron
              Add alternate pronunciations to FSG

       -fsgusefiller
              Insert filler words at each state.

       -fwdflat
              Run forward flat-lexicon search over word lattice (2nd pass)

       -fwdflatbeam
              Beam width applied to every frame in second-pass flat search

       -fwdflatefwid
              Minimum number of end frames for a word to be searched in
              fwdflat search

       -fwdflatlw
              Language model probability weight for flat lexicon (2nd pass)
              decoding

       -fwdflatsfwin
              Window of frames in lattice to search for successor words in
              fwdflat search

       -fwdflatwbeam
              Beam width applied to word exits in second-pass flat search

       -fwdtree
              Run forward lexicon-tree search (1st pass)

       -hmm   Directory containing acoustic model files.

       -infile
              Audio file to transcribe.

       -inmic Transcribe audio from microphone.

       -input_endian
              Endianness of input data, big or little, ignored if NIST or MS
              Wav

       -jsgf  JSGF grammar file

       -keyphrase
              Keyphrase to spot

       -kws   File with keyphrases to spot, one per line

       -kws_delay
              Delay to wait for best detection score

       -kws_plp
              Phone loop probability for keyword spotting

       -kws_threshold
              Threshold for p(hyp)/p(alternatives) ratio

       -latsize
              Initial backpointer table size

       -lda   File containing transformation matrix to be applied to features
              (single-stream features only)

       -ldadim
              Dimensionality of output of feature transformation (0 to use
              entire matrix)

       -lifter
              Length of sin-curve for liftering, or 0 for no liftering.

       -lm    Word trigram language model input file

       -lmctl Specify a set of language models

       -lmname
              Which language model in -lmctl to use by default

       -logbase
              Base in which all log-likelihoods are calculated

       -logfn File to write log messages in

       -logspec
              Write out logspectral files instead of cepstra

       -lowerf
              Lower edge of filters

       -lpbeam
              Beam width applied to last phone in words

       -lponlybeam
              Beam width applied to last phone in single-phone words

       -lw    Language model probability weight

       -maxhmmpf
              Maximum number of active HMMs to maintain at each frame (or -1
              for no pruning)

       -maxwpf
              Maximum number of distinct word exits at each frame (or -1 for
              no pruning)

       -mdef  Model definition input file

       -mean  Mixture gaussian means input file

       -mfclogdir
              Directory to log feature files to

       -min_endfr
              Nodes ignored in lattice construction if they persist for fewer
              than N frames

       -mixw  Senone mixture weights input file (uncompressed)

       -mixwfloor
              Senone mixture weights floor (applied to data from -mixw file)

       -mllr  MLLR transformation to apply to means and variances

       -mmap  Use memory-mapped I/O (if possible) for model files

       -ncep  Number of cep coefficients

       -nfft  Size of FFT

       -nfilt Number of filter banks

       -nwpen New word transition penalty

       -pbeam Beam width applied to phone transitions

       -pip   Phone insertion penalty

       -pl_beam
              Beam width applied to phone loop search for lookahead

       -pl_pbeam
              Beam width applied to phone loop transitions for lookahead

       -pl_pip
              Phone insertion penalty for phone loop

       -pl_weight
              Weight for phoneme lookahead penalties

       -pl_window
              Phoneme lookahead window size, in frames

       -rawlogdir
              Directory to log raw audio files to

       -remove_dc
              Remove DC offset from each frame

       -remove_noise
              Remove noise with spectral subtraction in mel-energies

       -remove_silence
              Enables VAD, removes silence frames from processing

       -round_filters
              Round mel filter frequencies to DFT points

       -samprate
              Sampling rate

       -seed  Seed for random number generator; if less than zero, pick our
              own

       -sendump
              Senone dump (compressed mixture weights) input file

       -senlogdir
              Directory to log senone score files to

       -senmgau
              Senone to codebook mapping input file (usually not needed)

       -silprob
              Silence word transition probability

       -smoothspec
              Write out cepstral-smoothed logspectral files

       -svspec
              Subvector specification (e.g., 24,0-11/25,12-23/26-38 or
              0-12/13-25/26-38)

       -time  Print word times in file transcription.

       -tmat  HMM state transition matrix input file

       -tmatfloor
              HMM state transition probability floor (applied to -tmat file)

       -topn  Maximum number of top Gaussians to use in scoring.

       -topn_beam
              Beam width used to determine top-N Gaussians (or a list,
              per-feature)

       -toprule
              Start rule for JSGF (first public rule is default)

       -transform
              Which type of transform to use to calculate cepstra (legacy,
              dct, or htk)

       -unit_area
              Normalize mel filters to unit area

       -upperf
              Upper edge of filters

       -uw    Unigram weight

       -vad_postspeech
              Number of silence frames to keep after the transition from
              speech to silence.

       -vad_prespeech
              Number of speech frames to keep before the transition from
              silence to speech.

       -vad_startspeech
              Number of speech frames needed to trigger the VAD transition
              from silence to speech.

       -vad_threshold
              Threshold for decision between noise and silence frames.
              Log-ratio between signal level and noise level.

       -var   Mixture gaussian variances input file

       -varfloor
              Mixture gaussian variance floor (applied to data from -var file)

       -varnorm
              Variance normalize each utterance (only if CMN == current)

       -verbose
              Show input filenames

       -warp_params
              Parameters defining the warping function

       -warp_type
              Warping function type (or shape)

       -wbeam Beam width applied to word exits

       -wip   Word insertion penalty

       -wlen  Hamming window length

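       The note on -beam (smaller values mean wider beam) can be illustrated
       with a toy pruning step. This sketch assumes the beam is a relative
       probability in (0, 1] and that a hypothesis survives when its score
       is within that factor of the best score in the frame. That reading is
       an assumption for illustration, and the prune function and its
       natural-log scores are hypothetical, not pocketsphinx code.

```python
import math
from typing import Dict


def prune(log_scores: Dict[str, float], beam: float) -> Dict[str, float]:
    """Keep hypotheses whose log score is within log(beam) of the best one.

    Assumption: `beam` is a relative probability in (0, 1], so log(beam)
    is zero or negative and a smaller beam lowers the pruning threshold.
    """
    threshold = max(log_scores.values()) + math.log(beam)
    return {name: s for name, s in log_scores.items() if s >= threshold}


# Three toy hypotheses with natural-log scores (made-up values).
scores = {"a": -10.0, "b": -50.0, "c": -200.0}

narrow = prune(scores, 1e-10)   # larger beam value: fewer survivors
wide = prune(scores, 1e-100)    # smaller beam value: more survivors
```

       Under this assumption, lowering the beam value moves the threshold
       further below the best score, so more hypotheses are kept and the
       search is wider.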

AUTHOR

       Written by numerous people at CMU from 1994 onwards. This manual page
       by David Huggins-Daines <dhuggins@cs.cmu.edu>

       Copyright © 1994-2016 Carnegie Mellon University. See the file LICENSE
       included with this package for more information.

SEE ALSO

       pocketsphinx_batch(1), sphinx_fe(1).

                                  2016-04-01        POCKETSPHINX_CONTINUOUS(1)