hunspell(3) Library Functions Manual hunspell(3)

NAME
hunspell - spell checking, stemming, morphological generation and
analysis

SYNOPSIS
#include <hunspell.hxx> /* or */
#include <hunspell.h>

Hunspell(const char *affpath, const char *dpath);

Hunspell(const char *affpath, const char *dpath, const char *key);

~Hunspell();

int add_dic(const char *dpath);

int add_dic(const char *dpath, const char *key);

int spell(const char *word);

int spell(const char *word, int *info, char **root);

int suggest(char ***slst, const char *word);

int analyze(char ***slst, const char *word);

int stem(char ***slst, const char *word);

int stem(char ***slst, char **morph, int n);

int generate(char ***slst, const char *word, const char *word2);

int generate(char ***slst, const char *word, char **desc, int n);

void free_list(char ***slst, int n);

int add(const char *word);

int add_with_affix(const char *word, const char *example);

int remove(const char *word);

char *get_dic_encoding();

const char *get_wordchars();

unsigned short *get_wordchars_utf16(int *len);

struct cs_info *get_csconv();

const char *get_version();

DESCRIPTION
The Hunspell library routines provide word-level linguistic
functions: spell checking and correction, stemming, and morphological
generation and analysis in item-and-arrangement style.

The optional C header exposes the C interface of the C++ library. It
provides Hunspell_create and Hunspell_destroy in place of the
constructor and destructor, and its wrapper functions take an extra
HunHandle parameter (the allocated object); see the C header file
hunspell.h.

The basic spelling functions, spell() and suggest(), can also be used
for stemming, morphological generation and analysis when given XML
input (see XML API).

Constructor and destructor
Hunspell's constructor takes the paths of the affix and dictionary
files. (In a WIN32 environment, use UTF-8 encoded paths that start with
the long path prefix \\?\ to handle system-independent character
encoding and very long path names.) See the hunspell(4) manual page
for the dictionary format. The optional key parameter is for
dictionaries encrypted by the hzip tool of the Hunspell distribution.

Extra dictionaries
The add_dic() function loads an extra dictionary file. Extra
dictionaries use the affix file of the allocated Hunspell object.
The maximum number of extra dictionaries is limited in the source code
(20).

Spelling and correction
The spell() function returns a non-zero value if the input word is
recognised by the spell checker, and zero if not. The optional
reference variables return a bit array (info) and the root word of the
input word. Info bits, checked with the SPELL_COMPOUND, SPELL_FORBIDDEN
or SPELL_WARN macros, mark compound words, explicitly forbidden words
and probably bad words. From version 1.3, the non-zero return value is
2 for dictionary words with the "WARN" flag (probably bad words).

The suggest() function takes two parameters: a reference variable for
the output suggestion list, and an input word. The function returns
the number of suggestions. The reference variable will contain the
address of the newly allocated suggestion list, or NULL if the return
value of suggest() is zero. The maximum number of suggestions is
limited in the source code.

Both spell() and suggest() can recognize XML input; see the XML API
section.

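The check-then-suggest pattern described above could be wrapped roughly as follows. This is a sketch, not library code: collect_suggestions is a hypothetical name, and the speller type is a template parameter only so that the block compiles without the Hunspell headers. In real code it would be the Hunspell class, using the spell(), suggest() and free_list() members listed in the synopsis.

```cpp
#include <string>
#include <vector>

// Hypothetical wrapper: returns suggestions for a word that spell() rejects,
// and an empty vector for a word it accepts. Speller stands in for the
// Hunspell class; only spell(), suggest() and free_list() are used.
template <class Speller>
std::vector<std::string> collect_suggestions(Speller &speller,
                                             const std::string &word) {
    std::vector<std::string> out;
    if (speller.spell(word.c_str()) != 0) return out;  // recognised word

    char **slst = nullptr;
    int n = speller.suggest(&slst, word.c_str());
    for (int i = 0; i < n; ++i) out.emplace_back(slst[i]);  // copy each entry

    // The list is owned by the library; release it once it has been copied.
    if (n > 0) speller.free_list(&slst, n);
    return out;
}
```

Copying into a std::vector keeps the raw list and its matching free_list() call in one place, so callers never handle library-owned memory.
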
Morphological functions
The plain stem() and analyze() functions are similar to suggest(), but
instead of suggestions they return stems and the results of the
morphological analysis. The plain generate() expects a second word,
too. This extra word and its affixation serve as the model for the
morphological generation of the requested forms of the first word.

The extended stem() and generate() use the results of a morphological
analysis:

    char **result, **result2;
    int n1 = analyze(&result, "words");
    int n2 = stem(&result2, result, n1);

The morphological annotation of the Hunspell library has fixed field
identifiers (two letters and a colon); see the hunspell(4) manual page.

    char **result;
    char desc[] = "is:plural"; // description depends on dictionaries, too
    char *affix = desc;        // generate() takes a non-const char **
    int n = generate(&result, "word", &affix, 1);
    for (int i = 0; i < n; i++) printf("%s\n", result[i]);

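Because every field identifier is two letters followed by a colon, an analysis string can be split into identifier and value pairs with very little code. The sketch below assumes that fields are separated by whitespace; parse_morph_fields is a hypothetical helper, and the field names used with it depend on the dictionary.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: splits one analysis string into (identifier, value)
// pairs. Tokens that do not have the "xx:value" shape are skipped.
std::vector<std::pair<std::string, std::string>>
parse_morph_fields(const std::string &analysis) {
    std::vector<std::pair<std::string, std::string>> fields;
    std::istringstream in(analysis);
    std::string token;
    while (in >> token) {
        // A field is two characters, then a colon, then the value.
        if (token.size() >= 3 && token[2] == ':')
            fields.emplace_back(token.substr(0, 2), token.substr(3));
    }
    return fields;
}
```

Each string returned by analyze() could be passed through such a helper before the list is released with free_list().
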
Memory deallocation
The free_list() function frees the memory allocated by the suggest(),
analyze(), generate() and stem() functions.

Other functions
The add(), add_with_affix() and remove() functions are helpers for
implementing a personal dictionary: they add words to and remove words
from the base dictionary at run time. The add_with_affix() function
uses a second word as a model of the affixation enabled for the new
word.

The get_dic_encoding() function returns "ISO8859-1" or the character
encoding defined in the affix file with the "SET" keyword.

The get_csconv() function returns the 8-bit character case table of the
dictionary's encoding.

The get_wordchars() and get_wordchars_utf16() functions return the
extra word characters defined for tokenization in the affix file with
the "WORDCHARS" keyword.

The get_version() function returns the version string of the library.

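To illustrate what the extra word characters are for, here is a rough tokenizer sketch that treats letters plus a caller-supplied set of extra characters as word material. The tokenize function is hypothetical, and it assumes single-byte text classified by the current C locale. A UTF-8 dictionary would need get_wordchars_utf16() and a different character test.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical tokenizer: a word is a run of alphabetic bytes or bytes found
// in wordchars (for example the string returned by get_wordchars()).
std::vector<std::string> tokenize(const std::string &text,
                                  const std::string &wordchars) {
    std::vector<std::string> words;
    std::string current;
    for (unsigned char c : text) {
        bool in_word = std::isalpha(c) != 0 ||
                       wordchars.find(static_cast<char>(c)) != std::string::npos;
        if (in_word) {
            current.push_back(static_cast<char>(c));
        } else if (!current.empty()) {
            words.push_back(current);  // a separator ends the current word
            current.clear();
        }
    }
    if (!current.empty()) words.push_back(current);
    return words;
}
```

With an apostrophe and a hyphen among the extra characters, "don't" and "e-mail" stay whole; without them they split at those characters.
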
XML API
The spell() function returns non-zero for the "<?xml?>" input,
indicating XML API support.

The suggest() function stems, analyzes and generates the forms of the
input word if it is passed in one of the following "SPELLML" syntaxes:

<?xml?>
<query type="analyze">
<word>dogs</word>
</query>

<?xml?>
<query type="stem">
<word>dogs</word>
</query>

<?xml?>
<query type="generate">
<word>dog</word>
<word>cats</word>
</query>

<?xml?>
<query type="generate">
<word>dog</word>
<code><a>is:pl</a><a>is:poss</a></code>
</query>

The outputs of the type="stem" query and the stem() library function
are the same. The output of the type="analyze" query is a string
containing a <code><a>result1</a><a>result2</a>...</code> element. This
element can be used in the second syntax of the type="generate" query.

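The word-only query forms above can be assembled with plain string handling, and the <a> elements of an analysis result can be pulled out the same way. Both functions below are hypothetical helpers, not part of the library. They assume the words contain no markup characters, since no escaping is done, and that the result holds simple, non-nested <a> elements.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: builds a SPELLML query of the given type ("analyze",
// "stem" or "generate") around one <word> element per entry in words.
std::string spellml_query(const std::string &type,
                          const std::vector<std::string> &words) {
    std::string q = "<?xml?>\n<query type=\"" + type + "\">\n";
    for (const std::string &w : words) q += "<word>" + w + "</word>\n";
    q += "</query>";
    return q;
}

// Hypothetical helper: returns the text of every <a>...</a> element found
// in an analysis result string.
std::vector<std::string> spellml_results(const std::string &xml) {
    std::vector<std::string> out;
    const std::string open = "<a>", close = "</a>";
    std::string::size_type pos = 0;
    while ((pos = xml.find(open, pos)) != std::string::npos) {
        pos += open.size();
        std::string::size_type end = xml.find(close, pos);
        if (end == std::string::npos) break;  // unterminated element
        out.push_back(xml.substr(pos, end - pos));
        pos = end + close.size();
    }
    return out;
}
```

The query string would be passed to suggest() as the input word, and the returned list released with free_list() as usual.
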
EXAMPLE
See analyze.cxx in the Hunspell distribution.

AUTHOR
Hunspell is based on Ispell's spell checking algorithms and
OpenOffice.org's MySpell source code.

The author of International Ispell is Geoff Kuenning.

The author of MySpell is Kevin Hendricks.

The author of Hunspell is László Németh.

The author of the original C API is Caolan McNamara.

The author of the Aspell table-driven phonetic transcription algorithm
and code is Björn Jacke.

See also the THANKS and Changelog files of the Hunspell distribution.

2014-05-26 hunspell(3)