hunspell(3) Library Functions Manual hunspell(3)

NAME
hunspell - spell checking, stemming, morphological generation and
analysis

SYNOPSIS
#include <hunspell.hxx> /* or */
#include <hunspell.h>

Hunspell(const char *affpath, const char *dpath);

Hunspell(const char *affpath, const char *dpath, const char *key);

~Hunspell();

int add_dic(const char *dpath);

int add_dic(const char *dpath, const char *key);

int spell(const char *word);

int spell(const char *word, int *info, char **root);

int suggest(char ***slst, const char *word);

int analyze(char ***slst, const char *word);

int stem(char ***slst, const char *word);

int stem(char ***slst, char **morph, int n);

int generate(char ***slst, const char *word, const char *word2);

int generate(char ***slst, const char *word, char **desc, int n);

void free_list(char ***slst, int n);

int add(const char *word);

int add_with_affix(const char *word, const char *example);

int remove(const char *word);

char *get_dic_encoding();

const char *get_wordchars();

unsigned short *get_wordchars_utf16(int *len);

struct cs_info *get_csconv();

const char *get_version();

DESCRIPTION
The Hunspell library routines give the user word-level linguistic
functions: spell checking and correction, stemming, morphological
generation and analysis in item-and-arrangement style.

The optional C header contains the C interface of the C++ library. It
provides Hunspell_create and Hunspell_destroy as constructor and
destructor, and its wrapper functions take an extra HunHandle parameter
(the allocated object). See the C header file hunspell.h.

The basic spelling functions, spell() and suggest(), can also be used
for stemming, morphological generation and analysis when given XML
input (see XML API).

Constructor and destructor
Hunspell's constructor needs the paths of the affix and dictionary
files. (In a WIN32 environment, use UTF-8 encoded paths that start with
the long path prefix \\?\ to handle system-independent character
encoding and very long path names.) See the hunspell(4) manual page for
the dictionary format. The optional key parameter is for dictionaries
encrypted by the hzip tool of the Hunspell distribution.
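
The long path prefix can be added mechanically before the paths reach the
constructor. The following is a minimal sketch, not part of Hunspell: the
helper name with_long_path_prefix is hypothetical, and it assumes the caller
already has an absolute, backslash-separated, UTF-8 encoded path.

```cpp
#include <string>

// Hypothetical helper (not part of Hunspell): prepend the WIN32 long path
// prefix to an absolute UTF-8 path unless the prefix is already present.
std::string with_long_path_prefix(const std::string& utf8_path) {
    // Backslash, backslash, question mark, backslash.
    const std::string prefix = "\\\\?\\";
    if (utf8_path.compare(0, prefix.size(), prefix) == 0) {
        return utf8_path;  // already prefixed, leave unchanged
    }
    return prefix + utf8_path;
}
```

The results would then be passed to the constructor through c_str(), once
for the affix file and once for the dictionary file.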

Extra dictionaries
The add_dic() function loads an extra dictionary file. The extra
dictionaries use the affix file of the allocated Hunspell object. The
maximum number of extra dictionaries is limited in the source code
(20).

Spelling and correction
The spell() function returns non-zero if the input word is recognised
by the spell checker, and zero if not. The optional reference
variables return a bit array (info) and the root word of the input
word. The info bits, checked with the SPELL_COMPOUND, SPELL_FORBIDDEN
and SPELL_WARN macros, mark compound words, explicitly forbidden words
and probably bad words. From version 1.3, the non-zero return value is
2 for dictionary words with the "WARN" flag (probably bad words).
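
The return value and the info bits can be folded into one small record. The
sketch below assumes the three-argument spell() from the synopsis. It is
written as a template so that it compiles without the Hunspell headers; the
SpellReport type and the check_word name are hypothetical. The fallback
macro values are placeholders used only when hunspell.hxx has not been
included first, and the real values come from that header.

```cpp
#include <string>

// Placeholder bit values for this sketch only; when hunspell.hxx is
// included before this code, its own definitions are used instead.
#ifndef SPELL_COMPOUND
#define SPELL_COMPOUND (1 << 0)
#endif
#ifndef SPELL_FORBIDDEN
#define SPELL_FORBIDDEN (1 << 1)
#endif
#ifndef SPELL_WARN
#define SPELL_WARN (1 << 6)
#endif

// Hypothetical result type for this sketch.
struct SpellReport {
    bool accepted;   // spell() returned non-zero
    bool warned;     // return value 2, or the SPELL_WARN bit is set
    bool compound;   // the SPELL_COMPOUND bit is set
    bool forbidden;  // the SPELL_FORBIDDEN bit is set
};

// Checker is any type offering spell(const char*, int*, char**),
// for example a Hunspell object with the signatures from the synopsis.
template <class Checker>
SpellReport check_word(Checker& checker, const std::string& word) {
    int info = 0;
    // The root word is not requested, so a null pointer is passed for it.
    const int rc = checker.spell(word.c_str(), &info, nullptr);
    SpellReport report;
    report.accepted = rc != 0;
    report.warned = rc == 2 || (info & SPELL_WARN) != 0;
    report.compound = (info & SPELL_COMPOUND) != 0;
    report.forbidden = (info & SPELL_FORBIDDEN) != 0;
    return report;
}
```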

The suggest() function has two parameters: a reference variable for the
output suggestion list, and an input word. The function returns the
number of suggestions. The reference variable will contain the address
of the newly allocated suggestion list, or NULL if the return value of
suggest() is zero. The maximum number of suggestions is limited in the
source code.
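
A caller usually copies the suggestions into its own container and releases
the list immediately. The following sketch assumes the suggest() and
free_list() signatures from the synopsis; the function name
collect_suggestions is hypothetical, and the template parameter stands in
for a Hunspell object so that the code compiles on its own.

```cpp
#include <string>
#include <vector>

// Checker is any type offering suggest(char***, const char*) and
// free_list(char***, int), as listed in the synopsis.
template <class Checker>
std::vector<std::string> collect_suggestions(Checker& checker,
                                             const std::string& word) {
    char** list = nullptr;
    const int n = checker.suggest(&list, word.c_str());
    std::vector<std::string> out;
    if (n > 0 && list != nullptr) {
        out.assign(list, list + n);   // copies every C string
        checker.free_list(&list, n);  // the library's list is released here
    }
    return out;
}
```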

The spell() and suggest() functions can recognize XML input; see the
XML API section.

Morphological functions
The plain stem() and analyze() functions are similar to suggest(), but
instead of suggestions they return stems and the results of the
morphological analysis. The plain generate() expects a second word,
too. This extra word and its affixation serve as the model for the
morphological generation of the requested forms of the first word.

The extended stem() and generate() use the results of a morphological
analysis:

    char **result, **result2;  // both are lists of C strings
    int n1 = analyze(&result, "words");
    int n2 = stem(&result2, result, n1);

The morphological annotation of the Hunspell library has fixed field
identifiers (two letters and a colon); see the hunspell(4) manual page.

    char **result;
    char affix[] = "is:plural";  // the description also depends on the dictionary
    char *desc[] = { affix };
    int n = generate(&result, "word", desc, 1);
    for (int i = 0; i < n; i++) printf("%s\n", result[i]);
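
The fragments above leave out cleanup. A fuller sketch of the analyze-then-
stem sequence, with both lists released through free_list(), might look as
follows. The name stems_via_analysis is hypothetical, and the template
parameter stands in for a Hunspell object with the signatures from the
synopsis.

```cpp
#include <string>
#include <vector>

// Checker is any type offering analyze(char***, const char*),
// stem(char***, char**, int) and free_list(char***, int).
template <class Checker>
std::vector<std::string> stems_via_analysis(Checker& checker,
                                            const std::string& word) {
    std::vector<std::string> stems;

    char** analyses = nullptr;
    const int n1 = checker.analyze(&analyses, word.c_str());
    if (n1 <= 0 || analyses == nullptr) {
        return stems;  // nothing was allocated, nothing to free
    }

    // Feed the analysis results into the extended stem().
    char** stem_list = nullptr;
    const int n2 = checker.stem(&stem_list, analyses, n1);
    if (n2 > 0 && stem_list != nullptr) {
        stems.assign(stem_list, stem_list + n2);
        checker.free_list(&stem_list, n2);
    }
    checker.free_list(&analyses, n1);
    return stems;
}
```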

Memory deallocation
The free_list() function frees the memory allocated by the suggest(),
analyze(), generate() and stem() functions.
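
In C++ code the free_list() call can be tied to scope exit so that early
returns do not leak the list. The guard class below is a sketch under that
assumption; ListGuard and its members are hypothetical names and not part
of the library.

```cpp
// Hypothetical scope guard: owns one list filled by suggest(), analyze(),
// generate() or stem() and hands it back to free_list() on destruction.
template <class Checker>
class ListGuard {
public:
    explicit ListGuard(Checker& checker)
        : checker_(checker), list_(nullptr), count_(0) {}
    ~ListGuard() { reset(); }
    ListGuard(const ListGuard&) = delete;
    ListGuard& operator=(const ListGuard&) = delete;

    // Address to pass as the slst argument; frees any list held before.
    char*** out() { reset(); return &list_; }
    // Record the count returned by the library call.
    void set_count(int n) { count_ = n > 0 ? n : 0; }
    int size() const { return count_; }
    const char* operator[](int i) const { return list_[i]; }

private:
    void reset() {
        if (list_ != nullptr) checker_.free_list(&list_, count_);
        list_ = nullptr;
        count_ = 0;
    }
    Checker& checker_;
    char** list_;
    int count_;
};
```

With a Hunspell object h, usage would be along the lines of
guard.set_count(h.suggest(guard.out(), "wrd")), after which the entries
are read through guard[i] until the guard goes out of scope.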

Other functions
The add(), add_with_affix() and remove() functions are helpers for a
personal dictionary implementation; they add words to and remove words
from the base dictionary at run time. The add_with_affix() function
uses a second root word as the model for the enabled affixation and
compounding of the new word.
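
A personal dictionary layer can be built on these helpers. The sketch below
assumes a simple text format chosen for illustration only, with one entry
per line and an optional model word after a slash ("word/model"). The
function name load_personal_words is hypothetical, and the template
parameter stands in for a Hunspell object.

```cpp
#include <sstream>
#include <string>

// Checker is any type offering add(const char*) and
// add_with_affix(const char*, const char*).
// Returns the number of entries handed to the checker.
template <class Checker>
int load_personal_words(Checker& checker, const std::string& text) {
    std::istringstream in(text);
    std::string line;
    int passed = 0;
    while (std::getline(in, line)) {
        if (line.empty()) continue;
        const std::string::size_type slash = line.find('/');
        if (slash == std::string::npos) {
            checker.add(line.c_str());
        } else {
            const std::string word = line.substr(0, slash);
            const std::string model = line.substr(slash + 1);
            if (word.empty() || model.empty()) continue;  // malformed entry
            // The new word inherits affixation and compounding from model.
            checker.add_with_affix(word.c_str(), model.c_str());
        }
        ++passed;
    }
    return passed;
}
```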

The get_dic_encoding() function returns "ISO8859-1" or the character
encoding defined in the affix file with the "SET" keyword.

The get_csconv() function returns the 8-bit character case table of the
encoding of the dictionary.

The get_wordchars() and get_wordchars_utf16() functions return the
extra word characters defined for tokenization in the affix file by the
"WORDCHARS" keyword.
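
The extra word characters matter when text is split into words before
checking. The tokenizer below is a simplified sketch: it treats only ASCII
letters as word characters (assuming the default "C" locale) plus whatever
is passed in extra_wordchars, which would typically be the string returned
by get_wordchars(). The function name split_words is hypothetical, and
multi-byte encodings are not handled.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Split text into words; a word is a run of ASCII letters and of the
// characters listed in extra_wordchars.
std::vector<std::string> split_words(const std::string& text,
                                     const std::string& extra_wordchars) {
    std::vector<std::string> words;
    std::string current;
    for (std::string::size_type i = 0; i < text.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(text[i]);
        const bool in_word =
            std::isalpha(c) != 0 ||
            extra_wordchars.find(text[i]) != std::string::npos;
        if (in_word) {
            current.push_back(text[i]);
        } else if (!current.empty()) {
            words.push_back(current);  // a separator ends the current word
            current.clear();
        }
    }
    if (!current.empty()) words.push_back(current);
    return words;
}
```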

The get_version() function returns the version string of the library.

XML API
The spell() function returns non-zero for the "<?xml?>" input,
indicating XML API support.

The suggest() function stems, analyzes and generates the forms of the
input word if it is passed in one of the following "SPELLML" syntaxes:

    <?xml?>
    <query type="analyze">
    <word>dogs</word>
    </query>

    <?xml?>
    <query type="stem">
    <word>dogs</word>
    </query>

    <?xml?>
    <query type="generate">
    <word>dog</word>
    <word>cats</word>
    </query>

    <?xml?>
    <query type="generate">
    <word>dog</word>
    <code><a>is:pl</a><a>is:poss</a></code>
    </query>

    <?xml?>
    <query type="add">
    <word>word</word>
    </query>

    <?xml?>
    <query type="add">
    <word>word</word>
    <word>model_word_for_affixation_and_compounding</word>
    </query>
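
The word-only queries above can be assembled from a query type and a list
of words. The following is a sketch with hypothetical helper names
(xml_escape, make_spellml_query); it reproduces the line layout of the
examples above. Escaping "&" and "<" inside the words is a precaution
assumed by this sketch and is not stated in this manual page.

```cpp
#include <string>
#include <vector>

// Escape the two characters that would otherwise break the markup.
std::string xml_escape(const std::string& s) {
    std::string out;
    for (char c : s) {
        if (c == '&') out += "&amp;";
        else if (c == '<') out += "&lt;";
        else out += c;
    }
    return out;
}

// Build a query of the given type ("analyze", "stem", "generate", "add")
// with one <word> element per entry in words.
std::string make_spellml_query(const std::string& type,
                               const std::vector<std::string>& words) {
    std::string query = "<?xml?>\n<query type=\"" + type + "\">\n";
    for (const std::string& w : words) {
        query += "<word>" + xml_escape(w) + "</word>\n";
    }
    query += "</query>";
    return query;
}
```

The resulting string would be passed to suggest() as the input word.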

The outputs of the type="stem" query and the stem() library function
are the same. The output of the type="analyze" query is a string
containing a <code><a>result1</a><a>result2</a>...</code> element. This
element can be used in the second syntax of the type="generate" query.
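
When the individual analyses are needed rather than the whole element, the
contents of the <a> elements can be pulled out with plain string searches.
The sketch below uses a hypothetical function name, parse_analysis_items,
and does no entity decoding or validation.

```cpp
#include <string>
#include <vector>

// Return the text between each <a> and </a> pair, in order of appearance.
std::vector<std::string> parse_analysis_items(const std::string& xml) {
    std::vector<std::string> items;
    const std::string open = "<a>";
    const std::string close = "</a>";
    std::string::size_type pos = 0;
    while ((pos = xml.find(open, pos)) != std::string::npos) {
        const std::string::size_type start = pos + open.size();
        const std::string::size_type end = xml.find(close, start);
        if (end == std::string::npos) break;  // unterminated element
        items.push_back(xml.substr(start, end - start));
        pos = end + close.size();
    }
    return items;
}
```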

EXAMPLE
See analyze.cxx in the Hunspell distribution.

AUTHORS
Hunspell is based on Ispell's spell checking algorithms and
OpenOffice.org's MySpell source code.

Author of International Ispell is Geoff Kuenning.

Author of MySpell is Kevin Hendricks.

Author of Hunspell is László Németh.

Author of the original C API is Caolan McNamara.

Author of the Aspell table-driven phonetic transcription algorithm and
code is Björn Jacke.

See also the THANKS and Changelog files of the Hunspell distribution.

2017-11-20 hunspell(3)