hunspell(1)

hunspell(1)                 General Commands Manual                hunspell(1)

NAME

       hunspell - spell checker, stemmer and morphological analyzer

SYNOPSIS

       hunspell [-1aDGHhLlmnOrstvwX] [--check-url] [--check-apostrophe]
       [-d dict[,dict2,...]] [--help] [-i enc] [-p dict] [-vv] [--version]
       [text/OpenDocument/TeX/LaTeX/HTML/SGML/XML/nroff/troff file(s)]

DESCRIPTION

       Hunspell is modeled on the Ispell program.  The most common usage is
       "hunspell" or "hunspell filename".  Without a filename parameter,
       hunspell checks the standard input.  Typing "cat" and "exsample" on
       two input lines results in an asterisk (meaning that "cat" is a
       correct word) and a line with corrections:

              $ hunspell -d en_US
              Hunspell 1.2.3
              *
              & exsample 4 0: example, examples, ex sample, ex-sample

       Correct words are marked with '*', '+' or '-'; unrecognized words are
       marked with '#' or '&' in the output lines (see the -a option below).
       Close the standard input with Ctrl-D on Unix/Linux, and with Ctrl-Z
       Enter or Ctrl-C on Windows.

       With filename parameters, hunspell displays each word of the files
       that does not appear in the dictionary at the top of the screen and
       allows you to change it.  If there are "near misses" in the
       dictionary, they are displayed on the following lines.  Finally, the
       line containing the word and the previous line are printed at the
       bottom of the screen.  If your terminal can display reverse video,
       the word itself is highlighted.  You can replace the word completely
       or choose one of the suggested words.  Commands are single characters
       as follows (case is ignored):

              R      Replace the misspelled word completely.

              Space  Accept the word this time only.

              A      Accept the word for the rest of this hunspell session.

              I      Accept the word, capitalized as it is in the file, and
                     update the private dictionary.

              U      Accept the word, and add an uncapitalized (actually,
                     all lower-case) version to the private dictionary.

              S      Ask for a stem and a model word and store them in the
                     private dictionary.  The stem will also be accepted
                     with the affixes of the model word.

              0-n    Replace with one of the suggested words.

              X      Write the rest of this file, ignoring misspellings, and
                     start the next file.

              Q      Exit immediately and leave the file unchanged.

              ^Z     Suspend hunspell.

              ?      Show the help screen.

OPTIONS

       -1     Check only the first field of each line (fields are delimited
              by tabs).

       -a     The -a option is intended to be used from other programs
              through a pipe.  In this mode, hunspell prints a one-line
              version identification message and then begins reading lines
              of input.  For each input line, a single line is written to
              the standard output for each word checked for spelling on the
              line.  If the word was found in the main dictionary or your
              personal dictionary, the line contains only a '*'.  If the
              word was found through affix removal, the line contains a '+',
              a space, and the root word.  If the word was found through
              compound formation (concatenation of two words), the line
              contains only a '-'.

              If the word is not in the dictionary, but there are near
              misses, the line contains an '&', a space, the misspelled
              word, a space, the number of near misses, a space, the number
              of characters between the beginning of the line and the
              beginning of the misspelled word, a colon, another space, and
              a list of the near misses separated by commas and spaces.

              Each near miss or guess is capitalized the same way as the
              input word unless such capitalization is illegal; in that case
              each near miss is capitalized correctly according to the
              dictionary.

              Finally, if the word does not appear in the dictionary and
              there are no near misses, the line contains a '#', a space,
              the misspelled word, a space, and the character offset from
              the beginning of the line.  The output for each input line is
              terminated with an additional blank line, indicating that
              hunspell has finished processing that input line.

              These output lines can be summarized as follows:

              OK:        *

              Root:      + <root>

              Compound:  -

              Miss:      & <original> <count> <offset>: <miss>, <miss>, ...

              None:      # <original> <offset>

              For example, a dummy dictionary containing the words "fray",
              "Frey", "fry", and "refried" might produce the following
              response to the command "echo 'frqy refries' | hunspell -a":

              (#) Hunspell 0.4.1 (beta), 2005-05-26
              & frqy 3 0: fray, Frey, fry
              & refries 1 5: refried

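The output format described above can be read by a driving program one line at a time. The following Python sketch is one possible way to do it, assuming normal (not verbose-correction) mode; the `Result` class and the `parse_line` function are illustrative names, not part of hunspell.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Result:
    kind: str                      # "ok", "root", "compound", "miss" or "none"
    word: Optional[str] = None     # original word ("miss" and "none" only)
    root: Optional[str] = None     # root word ("root" only)
    offset: Optional[int] = None   # character offset within the input line
    suggestions: List[str] = field(default_factory=list)


def parse_line(line: str) -> Optional[Result]:
    """Parse one line of `hunspell -a` output.

    Returns None for lines that carry no result, such as the version
    banner and the blank line that ends the output for an input line.
    """
    tag, _, rest = line.rstrip("\n").partition(" ")
    if tag == "*":
        return Result("ok")
    if tag == "-":
        return Result("compound")
    if tag == "+":
        return Result("root", root=rest)
    if tag == "&":
        # "& <original> <count> <offset>: <miss>, <miss>, ..."
        head, _, misses = rest.partition(": ")
        word, _count, offset = head.split(" ")
        return Result("miss", word=word, offset=int(offset),
                      suggestions=misses.split(", "))
    if tag == "#":
        # "# <original> <offset>"
        word, offset = rest.split(" ")
        return Result("none", word=word, offset=int(offset))
    return None
```

Suggestions are split on a comma followed by a space, so a suggestion that itself contains a space (such as "ex sample") stays in one piece.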
              This mode is also suitable for interactive use when you want
              to figure out the spelling of a single word (although this is
              also the default behavior of hunspell without -a).

              In -a mode, hunspell also accepts lines of single words
              prefixed with any of '*', '&', '@', '+', '-', '~', '#', '!',
              '%', '`', or '^'.  A line starting with '*' tells hunspell to
              insert the word into the user's dictionary (similar to the I
              command).  A line starting with '&' tells hunspell to insert
              an all-lowercase version of the word into the user's
              dictionary (similar to the U command).  A line starting with
              '@' causes hunspell to accept this word in the future (similar
              to the A command).

              A line starting with '+', followed immediately by tex or
              nroff, causes hunspell to parse future input according to the
              syntax of that formatter.  A line consisting solely of a '+'
              places hunspell in TeX/LaTeX mode (similar to the -t option)
              and '-' returns hunspell to nroff/troff mode (but these
              commands are obsolete).  However, the string character type is
              not changed; the '~' command must be used to do this.  A line
              starting with '~' causes hunspell to set internal parameters
              (in particular, the default string character type) based on
              the filename given in the rest of the line.  (A file suffix is
              sufficient, but the period must be included.  Instead of a
              file name or suffix, a unique name, as listed in the language
              affix file, may be specified.)  However, the formatter parsing
              is not changed; the '+' command must be used to change the
              formatter.

              A line prefixed with '#' causes the personal dictionary to be
              saved.  A line prefixed with '!' turns on terse mode (see
              below), and a line prefixed with '%' returns hunspell to
              normal (non-terse) mode.  A line prefixed with '`' turns on
              verbose-correction mode (see below); this mode can only be
              disabled with the '%' command.

              Any input following the prefix characters '+', '-', '#', '!',
              '%', or '`' is ignored, as is any input following the filename
              on a '~' line.  To allow spell checking of lines beginning
              with these characters, a line starting with '^' has that
              character removed before it is passed to the spell-checking
              code.  It is recommended that programmatic interfaces prefix
              every data line with a caret ('^') to protect themselves
              against future changes in hunspell.

              To summarize these:

              *      Add to personal dictionary

              &      Add all-lowercase version to personal dictionary

              @      Accept word, but leave out of dictionary

              #      Save current personal dictionary

              ~      Set parameters based on filename

              +      Enter TeX mode

              -      Exit TeX mode

              !      Enter terse mode

              %      Exit terse mode

              `      Enter verbose-correction mode

              ^      Spell-check rest of line

              In terse mode, hunspell does not print lines beginning with
              '*', '+', or '-', all of which indicate correct words.  This
              significantly improves running speed when the driving program
              is going to ignore correct words anyway.

              In verbose-correction mode, hunspell includes the original
              word immediately after the indicator character in output lines
              beginning with '*', '+', and '-', which simplifies interaction
              for some programs.

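The recommendation to prefix every data line with '^', combined with terse mode, can be sketched as follows in Python. The function name `build_pipe_input` is hypothetical; it only prepares the text that a driving program would write to the standard input of "hunspell -a".

```python
def build_pipe_input(lines, terse=True):
    """Return the text to send to the standard input of `hunspell -a`.

    Every data line is prefixed with '^' so that a line which happens to
    start with a command character ('*', '&', '@', ...) is spell-checked
    instead of being interpreted as a command.
    """
    out = []
    if terse:
        out.append("!")              # '!' on its own line enters terse mode
    for text in lines:
        for line in text.splitlines():
            out.append("^" + line)   # hunspell removes the '^' before checking
    return "\n".join(out) + "\n"
```

For example, `build_pipe_input(["*not a command"])` yields a '!' line followed by `^*not a command`, so the leading asterisk is not taken as a request to add a word to the personal dictionary.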
       --check-apostrophe
              Check and force Unicode apostrophes (U+2019), if the spelling
              dictionary specifies one of the ASCII or Unicode apostrophes
              as a word character (see WORDCHARS, ICONV and OCONV in
              hunspell(5)).

       --check-url
              Check URLs, e-mail addresses and directory paths.

       -D     Show the detected path of the loaded dictionary, and list the
              search path and the available dictionaries.

       -d dict,dict2,...
              Set dictionaries by their base names, with or without paths.
              Example of the syntax:

              -d en_US,en_geo,en_med,de_DE,de_med

              Here en_US and de_DE are base dictionaries; each consists of
              an aff and dic file pair: en_US.aff, en_US.dic and de_DE.aff,
              de_DE.dic.  en_geo, en_med and de_med are special
              dictionaries: dictionaries without an affix file.  Special
              dictionaries are optional extensions of the base dictionaries,
              usually containing special (medical, legal, etc.) terms.
              There is no naming convention for special dictionaries apart
              from the ".dic" extension: a dictionary without an affix file
              extends the preceding base dictionary (the order of the
              parameter list matters for good suggestions).  The first item
              of the -d parameter list must be a base dictionary.

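The ordering rule for the -d list (each special dictionary extends the base dictionary that precedes it) can be illustrated with a short Python sketch. The function `group_dictionaries` and its `base_names` argument are hypothetical: the caller is assumed to already know which names have an aff file, which hunspell itself determines by looking at the files.

```python
def group_dictionaries(d_arg, base_names):
    """Group the items of a -d argument by the base dictionary they extend.

    d_arg      -- the comma-separated value given to -d
    base_names -- names assumed to have an aff file (base dictionaries)
    """
    groups = {}
    current = None
    for name in d_arg.split(","):
        if name in base_names:
            current = name               # start a new base dictionary
            groups[current] = []
        elif current is None:
            raise ValueError("first item of -d must be a base dictionary")
        else:
            groups[current].append(name)  # special dictionary extends it
    return groups
```

With the example list above and `{"en_US", "de_DE"}` as base names, en_geo and en_med are grouped under en_US and de_med under de_DE.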
       -G     Print only correct words or lines.

       -H     The input file is in SGML/HTML format.

       -h, --help
              Short help.

       -i enc Set input encoding.

       -L     Print lines with misspelled words.

       -l     The "list" option produces a list of misspelled words from
              the standard input.

       -m     Analyze the words of the input text (see also hunspell(5)
              about morphological analysis).  Without morphological data in
              the dictionary, it shows the flags of the affixes of the word
              forms, which is useful for dictionary developers.

       -n     The input file is in nroff/troff format.

       -O     The input file is in OpenDocument (ODF or Flat ODF) format.
              This option needs the unzip program; install it first if it
              is not already installed.

       -P password
              Set password for encrypted dictionaries.

       -p dict
              Set the path of the personal dictionary.  The default
              dictionary depends on the locale settings.  The following
              environment variables are searched: LC_ALL, LC_MESSAGES, and
              LANG.  If none are set, the default personal dictionary is
              $HOME/.hunspell_default.

              If -d or the DICTIONARY environment variable is set, the
              personal dictionary is $HOME/.hunspell_dicname.

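The lookup order for the default personal dictionary can be sketched in Python as below. This is an approximation under stated assumptions, not hunspell's actual code: the function name `personal_dictionary` is hypothetical, and the handling of a locale value (using it as the dictionary name after dropping any encoding or modifier suffix) is an assumption that this manual does not spell out.

```python
def personal_dictionary(env, dicname=None):
    """Guess the default personal dictionary path.

    env     -- a mapping of environment variables (for example os.environ)
    dicname -- the dictionary name given with -d, if any
    """
    name = dicname or env.get("DICTIONARY")
    if not name:
        # Locale variables are searched in this order.
        for var in ("LC_ALL", "LC_MESSAGES", "LANG"):
            if env.get(var):
                # Assumption: "en_US.UTF-8" or "en_US@euro" maps to "en_US".
                name = env[var].split(".")[0].split("@")[0]
                break
    # Fall back to "default" when nothing is set.
    return "{}/.hunspell_{}".format(env.get("HOME", ""), name or "default")
```

An explicit -p path (or the WORDLIST environment variable) makes this lookup unnecessary, since it names the personal dictionary directly.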
       -r     Warn about rare words, which are also potential spelling
              mistakes.

       -s     Stem the words of the input text (see also hunspell(5) about
              stemming).  The result depends on the dictionary data.

       -t     The input file is in TeX or LaTeX format.

       -v, --version
              Print version number.

       -vv    Print ispell(1) compatible version number.

       -w     Print misspelled words (= lines) from one word/line input.

       -X     The input file is in XML format.

EXAMPLES

       hunspell example.html
              Interactive spell checking of an HTML file with the default
              dictionary.

       hunspell -d en_US example.html
              Interactive spell checking of an HTML file with the en_US
              dictionary.

       hunspell -d en_US,en_US_med medical.txt
              Interactive spell checking with multiple dictionaries.

       hunspell *.odt
              Interactive spell checking of ODF documents.

       hunspell -l *.odt
              List the misspelled words of ODF documents.

       hunspell -l *.odt | sort | uniq >unrecognized
              Save the unrecognized words of ODF documents (filtering out
              duplicates).

       hunspell -p unrecognized_but_good *.odt
              Interactive spell checking of ODF documents, using the
              previously saved and reduced word list as a personal
              dictionary to speed up spell checking.

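The "hunspell -l ... | sort | uniq" pipeline above can also be driven from a program. The Python sketch below is one way to do it, assuming hunspell is installed and on the PATH; the function names are illustrative. The command line is built as an argument list so that no shell quoting is needed.

```python
import subprocess


def list_command(files, dictionaries=None):
    """Build the argument list for `hunspell -l [-d dict,...] file...`."""
    argv = ["hunspell", "-l"]
    if dictionaries:
        argv += ["-d", ",".join(dictionaries)]
    return argv + list(files)


def unique_sorted(output):
    """Mimic `sort | uniq` on the one-word-per-line output of -l."""
    return sorted({line for line in output.splitlines() if line})


def unrecognized_words(files, dictionaries=None):
    """Run hunspell -l and return the distinct unrecognized words."""
    proc = subprocess.run(list_command(files, dictionaries),
                          capture_output=True, text=True)
    return unique_sorted(proc.stdout)
```

Note that `sorted` orders by code point, whereas sort(1) follows the locale's collation, so the order of the resulting list may differ from the shell pipeline.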
ENVIRONMENT

       DICTIONARY
              Similar to -d.

       DICPATH
              Dictionary path.

       WORDLIST
              Equivalent to -p.

FILES

       The default dictionary depends on the locale settings.  The following
       environment variables are searched: LC_ALL, LC_MESSAGES, and LANG.
       If none are set, the following fallbacks are used:

       /usr/share/myspell/default.aff
              Path of default affix file.  See hunspell(5).

       /usr/share/myspell/default.dic
              Path of default dictionary file.  See hunspell(5).

       $HOME/.hunspell_default
              Default path to personal dictionary.

AUTHOR

       The author of the Hunspell executable is László Németh.  For the
       Hunspell library, see hunspell(3).

       This manual is based on Ispell's manual.  See ispell(1).

                                  2014-05-27                       hunspell(1)