hunspell(3)                Library Functions Manual                hunspell(3)

NAME

       hunspell  -  spell  checking,  stemming,  morphological  generation and
       analysis

SYNOPSIS

       #include <hunspell.hxx> /* or */
       #include <hunspell.h>

       Hunspell(const char *affpath, const char *dpath);

       Hunspell(const char *affpath, const char *dpath, const char * key);

       ~Hunspell();

       int add_dic(const char *dpath);

       int add_dic(const char *dpath, const char *key);

       int spell(const char *word);

       int spell(const char *word, int *info, char **root);

       int suggest(char***slst, const char *word);

       int analyze(char***slst, const char *word);

       int stem(char***slst, const char *word);

       int stem(char***slst, char **morph, int n);

       int generate(char***slst, const char *word, const char *word2);

       int generate(char***slst, const char *word, char **desc, int n);

       void free_list(char ***slst, int n);

       int add(const char *word);

       int add_with_affix(const char *word, const char *example);

       int remove(const char *word);

       char * get_dic_encoding();

       const char * get_wordchars();

       unsigned short * get_wordchars_utf16(int *len);

       struct cs_info * get_csconv();

       const char * get_version();

DESCRIPTION

       The Hunspell library routines give the user word-level linguistic
       functions: spell checking and correction, stemming, morphological
       generation and analysis in item-and-arrangement style.

       The optional C header contains the C interface of the C++ library, with
       the Hunspell_create and Hunspell_destroy functions as constructor and
       destructor, and an extra HunHandle parameter (the allocated object) in
       the wrapper functions (see the C header file hunspell.h).

       The basic spelling functions, spell() and suggest(), can also be used
       for stemming, morphological generation and analysis when given XML
       input (see XML API).

   Constructor and destructor
       Hunspell's constructor needs the paths of the affix and dictionary
       files. (In a WIN32 environment, use UTF-8 encoded paths starting with
       the long path prefix \\?\ to handle system-independent character
       encoding and very long path names.) See the hunspell(4) manual page for
       the dictionary format. The optional key parameter is for dictionaries
       encrypted by the hzip tool of the Hunspell distribution.
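
       The WIN32 note above can be illustrated with a small helper. This is
       only a sketch: with_long_path_prefix() is a hypothetical name, not
       part of the Hunspell API, and it assumes the caller already holds an
       absolute, UTF-8 encoded path.

```cpp
#include <string>

// Hypothetical helper (not part of Hunspell): put the WIN32 long path
// prefix in front of a UTF-8 encoded path unless it is already there.
std::string with_long_path_prefix(const std::string& utf8_path) {
    // The prefix is the four characters: backslash, backslash, ?, backslash.
    const std::string prefix = "\\\\?\\";
    if (utf8_path.compare(0, prefix.size(), prefix) == 0) {
        return utf8_path;  // already prefixed, return unchanged
    }
    return prefix + utf8_path;
}
```

       The c_str() of the returned string could then be passed as the affpath
       or dpath argument of the constructor.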

   Extra dictionaries
       The add_dic() function loads an extra dictionary file. The extra
       dictionaries use the affix file of the allocated Hunspell object. The
       maximum number of extra dictionaries is limited in the source code
       (20).

   Spelling and correction
       The spell() function returns non-zero if the input word is recognised
       by the spell checker, and zero if not. The optional reference
       variables return a bit array (info) and the root word of the input
       word. Info bits checked with the SPELL_COMPOUND, SPELL_FORBIDDEN or
       SPELL_WARN macros indicate compound words, explicitly forbidden words
       and probably bad words. From version 1.3, the non-zero return value is
       2 for dictionary words with the flag "WARN" (probably bad words).
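
       The return value convention of spell() described above can be
       captured in a few lines. The enum and the function name below are
       hypothetical and exist only for illustration; the mapping itself
       follows the text (0, 2, any other non-zero value).

```cpp
// Hypothetical names (not part of Hunspell) for the three outcomes that the
// text describes for the return value of spell().
enum class SpellResult { Unknown, Accepted, AcceptedWithWarning };

// Map the int returned by spell() to one of the outcomes.
SpellResult classify_spell_result(int rc) {
    if (rc == 0) return SpellResult::Unknown;              // not recognised
    if (rc == 2) return SpellResult::AcceptedWithWarning;  // "WARN" flag, 1.3+
    return SpellResult::Accepted;                          // other non-zero
}
```

       With an allocated Hunspell object h, a call might look like
       classify_spell_result(h.spell("word")).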

       The suggest() function has two parameters: a reference variable for
       the output suggestion list, and an input word. The function returns
       the number of suggestions. The reference variable will contain the
       address of the newly allocated suggestion list, or NULL if the return
       value of suggest() is zero. The maximum number of suggestions is
       limited in the source code.
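
       A minimal sketch of the suggest() calling pattern follows. It is
       written as a template so that it does not depend on a particular
       header; Checker is assumed to be Hunspell or any type with the same
       suggest() and free_list() members as in the SYNOPSIS. The name
       collect_suggestions() is hypothetical.

```cpp
#include <string>
#include <vector>

// Copy the suggestions for `word` into a std::vector and release the
// C-style list again.
template <class Checker>
std::vector<std::string> collect_suggestions(Checker& checker, const char* word) {
    char** list = nullptr;
    const int n = checker.suggest(&list, word);

    std::vector<std::string> out;
    for (int i = 0; i < n; ++i) {
        out.push_back(list[i]);  // copy each C string before freeing the list
    }
    if (list != nullptr) {
        checker.free_list(&list, n);  // the list was allocated by suggest()
    }
    return out;
}
```

       Copying into std::string before calling free_list() keeps the caller
       free of manual memory management.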

       The spell() and suggest() functions can recognize XML input; see the
       XML API section.

   Morphological functions
       The plain stem() and analyze() functions are similar to suggest(), but
       instead of suggestions they return stems and the results of the
       morphological analysis. The plain generate() expects a second word,
       too. This extra word and its affixation serve as the model for the
       morphological generation of the requested forms of the first word.

       The extended stem() and generate() use the results of a morphological
       analysis:

              char **result, **result2;
              int n1 = analyze(&result, "words");
              int n2 = stem(&result2, result, n1);

       The morphological annotation of the Hunspell library has fixed (two
       letters and a colon) field identifiers; see the hunspell(4) manual
       page.

              char **result;
              char affix[] = "is:plural"; // description also depends on the dictionary
              char *desc = affix;
              int n = generate(&result, "word", &desc, 1);
              for (int i = 0; i < n; i++) printf("%s\n", result[i]);
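
       Because every field identifier is two letters followed by a colon, one
       line of analysis output can be split into fields with little code. The
       sketch below assumes whitespace-separated fields; parse_morph_fields()
       is a hypothetical name, and which fields actually appear depends on
       the dictionary.

```cpp
#include <map>
#include <sstream>
#include <string>

// Split one morphological description into its fields, keyed by the
// two-letter identifier before the colon. Tokens that do not look like
// "xx:value" are skipped. A multimap is used because an identifier may
// occur more than once in a description.
std::multimap<std::string, std::string> parse_morph_fields(const std::string& line) {
    std::multimap<std::string, std::string> fields;
    std::istringstream in(line);
    std::string token;
    while (in >> token) {
        if (token.size() > 3 && token[2] == ':') {
            fields.emplace(token.substr(0, 2), token.substr(3));
        }
    }
    return fields;
}
```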

   Memory deallocation
       The free_list() function frees the memory allocated by the suggest(),
       analyze(), generate() and stem() functions.

   Other functions
       The add(), add_with_affix() and remove() functions are helpers for a
       personal dictionary implementation; they add words to and remove words
       from the base dictionary at run time. The add_with_affix() function
       uses a second root word as the model for the enabled affixation and
       compounding of the new word.

       The get_dic_encoding() function returns "ISO8859-1" or the character
       encoding defined in the affix file with the "SET" keyword.

       The get_csconv() function returns the 8-bit character case table of the
       encoding of the dictionary.

       The get_wordchars() and get_wordchars_utf16() functions return the
       extra word characters defined in the affix file for tokenization by
       the "WORDCHARS" keyword.
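
       One way an application might use the extra word characters is shown
       below. This is a deliberately simple sketch and not the tokenizer of
       the Hunspell distribution: split_words() is a hypothetical name, and
       the code assumes an 8-bit encoding and the current C locale, so it is
       not suitable for UTF-8 text.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Split 8-bit text into words. A character belongs to a word if it is
// alphanumeric in the current C locale or appears in extra_wordchars,
// e.g. the string returned by get_wordchars().
std::vector<std::string> split_words(const std::string& text,
                                     const std::string& extra_wordchars) {
    std::vector<std::string> words;
    std::string current;
    for (unsigned char c : text) {
        const bool in_word =
            std::isalnum(c) != 0 ||
            extra_wordchars.find(static_cast<char>(c)) != std::string::npos;
        if (in_word) {
            current.push_back(static_cast<char>(c));
        } else if (!current.empty()) {
            words.push_back(current);  // a separator ends the current word
            current.clear();
        }
    }
    if (!current.empty()) words.push_back(current);
    return words;
}
```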

       The get_version() function returns the version string of the library.

   XML API
       The spell() function returns non-zero for the "<?xml?>" input,
       indicating XML API support.

       The suggest() function stems, analyzes and generates the forms of the
       input word if the input is given in one of the following "SPELLML"
       syntaxes:

              <?xml?>
              <query type="analyze">
              <word>dogs</word>
              </query>

              <?xml?>
              <query type="stem">
              <word>dogs</word>
              </query>

              <?xml?>
              <query type="generate">
              <word>dog</word>
              <word>cats</word>
              </query>

              <?xml?>
              <query type="generate">
              <word>dog</word>
              <code><a>is:pl</a><a>is:poss</a></code>
              </query>

              <?xml?>
              <query type="add">
              <word>word</word>
              </query>

              <?xml?>
              <query type="add">
              <word>word</word>
              <word>model_word_for_affixation_and_compounding</word>
              </query>

       The outputs of the type="stem" query and the stem() library function
       are the same. The output of the type="analyze" query is a string
       containing a <code><a>result1</a><a>result2</a>...</code> element. This
       element can be used in the second syntax of the type="generate" query.
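
       The word-based query syntaxes can be assembled with ordinary string
       code. In the sketch below, spellml_query() and xml_escape() are
       hypothetical helpers, not library functions. Escaping '&' and '<' as
       standard XML entities is an assumption about how the library decodes
       its input; this page does not confirm it.

```cpp
#include <string>
#include <vector>

// Escape the characters that would otherwise be read as markup.
// Assumption: the library decodes these standard XML entities.
std::string xml_escape(const std::string& s) {
    std::string out;
    for (char c : s) {
        if (c == '&') out += "&amp;";
        else if (c == '<') out += "&lt;";
        else out += c;
    }
    return out;
}

// Build a query made of <word> elements only; type is one of "analyze",
// "stem", "generate" or "add", as in the syntaxes listed in the text.
std::string spellml_query(const std::string& type,
                          const std::vector<std::string>& words) {
    std::string q = "<?xml?>\n<query type=\"" + type + "\">\n";
    for (const std::string& w : words) {
        q += "<word>" + xml_escape(w) + "</word>\n";
    }
    q += "</query>";
    return q;
}
```

       The c_str() of the resulting string would be passed as the word
       argument of suggest().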

EXAMPLE

       See analyze.cxx in the Hunspell distribution.

AUTHORS

       Hunspell is based on Ispell's spell checking algorithms and
       OpenOffice.org's MySpell source code.

       The author of International Ispell is Geoff Kuenning.

       The author of MySpell is Kevin Hendricks.

       The author of Hunspell is László Németh.

       The author of the original C API is Caolan McNamara.

       The author of the Aspell table-driven phonetic transcription algorithm
       and code is Björn Jacke.

       See also the THANKS and Changelog files of the Hunspell distribution.



                                  2017-11-20                       hunspell(3)