hunspell(3) Library Functions Manual hunspell(3)

NAME
hunspell - spell checking, stemming, morphological generation and
analysis

SYNOPSIS
#include <hunspell/hunspell.hxx> /* or */
#include <hunspell/hunspell.h>

Hunspell(const char *affpath, const char *dpath);

Hunspell(const char *affpath, const char *dpath, const char *key);

~Hunspell();

int add_dic(const char *dpath);

int add_dic(const char *dpath, const char *key);

int spell(const char *word);

int spell(const char *word, int *info, char **root);

int suggest(char ***slst, const char *word);

int analyze(char ***slst, const char *word);

int stem(char ***slst, const char *word);

int stem(char ***slst, char **morph, int n);

int generate(char ***slst, const char *word, const char *word2);

int generate(char ***slst, const char *word, char **desc, int n);

void free_list(char ***slst, int n);

int add(const char *word);

int add_with_affix(const char *word, const char *example);

int remove(const char *word);

char *get_dic_encoding();

const char *get_wordchars();

unsigned short *get_wordchars_utf16(int *len);

struct cs_info *get_csconv();

const char *get_version();

DESCRIPTION
The Hunspell library routines give the user word-level linguistic
functions: spell checking and correction, stemming, morphological
generation and analysis in item-and-arrangement style.

The optional C header contains the C interface of the C++ library. It
provides Hunspell_create and Hunspell_destroy as constructor and
destructor, and its wrapper functions take an extra HunHandle parameter
(the allocated object); see the C header file hunspell.h.

The basic spelling functions, spell() and suggest(), can also be used
for stemming, morphological generation and analysis when given XML
input (see XML API).

Constructor and destructor
Hunspell's constructor needs the paths of the affix and dictionary
files. See the hunspell(4) manual page for the dictionary format. The
optional key parameter is for dictionaries encrypted by the hzip tool
of the Hunspell distribution.

Extra dictionaries
The add_dic() function loads an extra dictionary file. The extra
dictionaries use the affix file of the allocated Hunspell object. The
maximum number of extra dictionaries is limited in the source code
(20).

Spelling and correction
The spell() function returns a non-zero value if the input word is
recognized by the spell checker, and zero if it is not. The optional
reference variables return a bit array (info) and the root word of the
input word. The SPELL_COMPOUND and SPELL_FORBIDDEN macros test the info
bits that mark compound words and explicitly forbidden words.

The suggest() function takes two parameters: a reference variable for
the output suggestion list, and the input word. It returns the number
of suggestions. The reference variable will contain the address of the
newly allocated suggestion list, or NULL if the return value of
suggest() is zero. The maximum number of suggestions is limited in the
source code.

spell() and suggest() also recognize XML input; see the XML API
section.

Morphological functions
The plain stem() and analyze() functions are similar to suggest(), but
instead of suggestions they return stems and the results of the
morphological analysis. The plain generate() expects a second word as
well. This extra word and its affixation serve as the model for the
morphological generation of the requested forms of the first word.

The extended stem() and generate() use the results of a morphological
analysis:

    char **result, **result2;   // both are lists of C strings
    int n1 = analyze(&result, "words");
    int n2 = stem(&result2, result, n1);

The morphological annotation of the Hunspell library uses fixed field
identifiers (two letters and a colon); see the hunspell(4) manual page.

    char **result;
    char desc[] = "is:plural";  // the description also depends on the dictionary
    char *affix = desc;
    int n = generate(&result, "word", &affix, 1);
    for (int i = 0; i < n; i++) printf("%s\n", result[i]);
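
Because every field starts with a two-letter identifier and a colon, an
analysis line can be split into identifier and value pairs with standard
C++ alone. The following is only a sketch: parse_fields() is a
hypothetical helper, not part of the Hunspell API, and it assumes that
fields are separated by whitespace and that values contain no spaces.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of Hunspell): splits one analysis line
// into (identifier, value) pairs. Assumes whitespace-separated fields of
// the form "xx:value"; tokens that do not match are skipped.
std::vector<std::pair<std::string, std::string>>
parse_fields(const std::string& analysis) {
    std::vector<std::pair<std::string, std::string>> fields;
    std::istringstream in(analysis);
    std::string tok;
    while (in >> tok) {
        // A field needs two identifier characters followed by a colon.
        if (tok.size() >= 3 && tok[2] == ':') {
            fields.emplace_back(tok.substr(0, 2), tok.substr(3));
            assert(fields.back().first.size() == 2);
        }
    }
    return fields;
}
```

For an illustrative input such as " st:word is:plural" the helper yields
the pairs (st, word) and (is, plural). The fields that actually appear
depend on the dictionary.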

Memory deallocation
The free_list() function frees the memory allocated by the suggest(),
analyze(), generate() and stem() functions.
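
One way to avoid forgetting the free_list() call is to copy each
returned list into a container and release it immediately. The sketch
below makes that pattern explicit. take_list() is a hypothetical helper,
and it is written as a template so that it compiles without the Hunspell
headers; the assumption is that Checker would be Hunspell in real use,
or any type with the free_list(char ***, int) member from the synopsis.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not part of Hunspell): copies an n-element list
// returned by suggest(), analyze(), stem() or generate() into a
// std::vector, then hands the original list back to free_list().
template <typename Checker>
std::vector<std::string> take_list(Checker& checker, char** list, int n) {
    assert(n >= 0);
    std::vector<std::string> out;
    if (list == nullptr || n == 0) return out;  // nothing to copy or free
    for (int i = 0; i < n; ++i) out.emplace_back(list[i]);
    checker.free_list(&list, n);                // release the C list
    return out;
}
```

With a Hunspell object h, a call might look like
"char **s; int n = h.suggest(&s, word);" followed by
"take_list(h, s, n)", after which s must not be used again.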

Other functions
The add(), add_with_affix() and remove() functions are helpers for a
personal dictionary implementation: they add words to and remove words
from the base dictionary at run time. The add_with_affix() function
uses a second word as a model for the affixation allowed on the new
word.

The get_dic_encoding() function returns "ISO8859-1" or the character
encoding defined in the affix file with the "SET" keyword.

The get_csconv() function returns the 8-bit character case table of the
dictionary's encoding.

The get_wordchars() and get_wordchars_utf16() functions return the
extra word characters defined in the affix file for tokenization with
the "WORDCHARS" keyword.

The get_version() function returns the version string of the library.

XML API
The spell() function returns a non-zero value for the "<?xml?>" input,
indicating XML API support.

The suggest() function stems, analyzes and generates the forms of the
input word if it is passed in one of the following "SPELLML" syntaxes:

<?xml?>
<query type="analyze">
<word>dogs</word>
</query>

<?xml?>
<query type="stem">
<word>dogs</word>
</query>

<?xml?>
<query type="generate">
<word>dog</word>
<word>cats</word>
</query>

<?xml?>
<query type="generate">
<word>dog</word>
<code><a>is:pl</a><a>is:poss</a></code>
</query>
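
The word-only queries above share one shape, so they can be assembled
with a small string helper. This is a sketch under assumptions:
spellml_query() is a hypothetical name, the line layout simply mirrors
the examples, and no XML escaping is done, so the words are assumed to
contain no markup characters. The second type="generate" syntax, which
carries a <code> element, is not covered.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not part of Hunspell): builds a SPELLML query of
// the given type ("analyze", "stem" or "generate") with one <word>
// element per entry. Assumes the words need no XML escaping.
std::string spellml_query(const std::string& type,
                          const std::vector<std::string>& words) {
    assert(!type.empty());
    std::string q = "<?xml?>\n<query type=\"" + type + "\">\n";
    for (const std::string& w : words) {
        q += "<word>" + w + "</word>\n";
    }
    q += "</query>";
    return q;
}
```

The resulting string would then be passed as the word argument of
suggest(), for example spellml_query("generate", {"dog", "cats"}).c_str().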

The outputs of the type="stem" query and the stem() library function
are the same. The output of the type="analyze" query is a string
containing a <code><a>result1</a><a>result2</a>...</code> element. This
element can be used in the second syntax of the type="generate" query.
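
When the individual analyses are needed rather than the whole element,
the <a> items can be pulled out with plain string searching. The sketch
below is an illustration only: split_analysis() is a hypothetical
helper, it assumes the items are not nested, and it does not decode any
character entities that the output may contain.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of Hunspell): returns the text of every
// <a>...</a> item found in the given string, in order of appearance.
std::vector<std::string> split_analysis(const std::string& xml) {
    const std::string open = "<a>";
    const std::string close = "</a>";
    std::vector<std::string> items;
    std::size_t pos = 0;
    while ((pos = xml.find(open, pos)) != std::string::npos) {
        pos += open.size();                       // start of the item text
        const std::size_t end = xml.find(close, pos);
        if (end == std::string::npos) break;      // unterminated item: stop
        assert(end >= pos);
        items.push_back(xml.substr(pos, end - pos));
        pos = end + close.size();
    }
    return items;
}
```

Each extracted item is one analysis line in the field notation described
under Morphological functions.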

EXAMPLE
See analyze.cxx in the Hunspell distribution.

AUTHOR
Hunspell is based on Ispell's spell checking algorithms and
OpenOffice.org's MySpell source code.

The author of International Ispell is Geoff Kuenning.

The author of MySpell is Kevin Hendricks.

The author of Hunspell is László Németh.

The author of the original C API is Caolan McNamara.

The author of the Aspell table-driven phonetic transcription algorithm
and code is Björn Jacke.

See also the THANKS and Changelog files of the Hunspell distribution.

2008-06-17 hunspell(3)