hunspell(3)              Library Functions Manual              hunspell(3)

NAME
hunspell - spell checking, stemming, morphological generation and
analysis

SYNOPSIS
#include <hunspell/hunspell.hxx> /* or */
#include <hunspell/hunspell.h>

Hunspell(const char *affpath, const char *dpath);

Hunspell(const char *affpath, const char *dpath, const char *key);

~Hunspell();

int add_dic(const char *dpath);

int add_dic(const char *dpath, const char *key);

int spell(const char *word);

int spell(const char *word, int *info, char **root);

int suggest(char ***slst, const char *word);

int analyze(char ***slst, const char *word);

int stem(char ***slst, const char *word);

int stem(char ***slst, char **morph, int n);

int generate(char ***slst, const char *word, const char *word2);

int generate(char ***slst, const char *word, char **desc, int n);

void free_list(char ***slst, int n);

int add(const char *word);

int add_with_affix(const char *word, const char *example);

int remove(const char *word);

char *get_dic_encoding();

const char *get_wordchars();

unsigned short *get_wordchars_utf16(int *len);

struct cs_info *get_csconv();

const char *get_version();

DESCRIPTION
The Hunspell library routines give the user word-level linguistic
functions: spell checking and correction, stemming, morphological
generation and analysis in item-and-arrangement style.

The optional C header contains the C interface of the C++ library. It
provides Hunspell_create() and Hunspell_destroy() as constructor and
destructor, and the wrapper functions take an extra HunHandle parameter
(the allocated object); see the C header file hunspell.h.

The basic spelling functions, spell() and suggest(), can also be used
for stemming, morphological generation and analysis when they are given
XML input (see XML API).

Constructor and destructor
Hunspell's constructor needs the paths of the affix and dictionary
files. See the hunspell(4) manual page for the dictionary format. The
optional key parameter is for dictionaries encrypted by the hzip tool
of the Hunspell distribution.

Extra dictionaries
The add_dic() function loads an extra dictionary file. Extra
dictionaries use the affix file of the allocated Hunspell object. The
maximum number of extra dictionaries is limited in the source code
(20).

Spelling and correction
The spell() function returns non-zero if the input word is recognised
by the spell checker, and zero if it is not. The optional reference
variables return a bit array (info) and the root word of the input
word. The info bits, checked with the SPELL_COMPOUND, SPELL_FORBIDDEN
and SPELL_WARN macros, mark compound words, explicitly forbidden words
and probably bad words. From version 1.3, the non-zero return value is
2 for dictionary words with the "WARN" flag (probably bad words).
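
The return-value convention described above can be captured in a small
helper. This is only a sketch: the enum and the function name are
hypothetical and not part of the library, and the helper looks at the
return value alone, not at the info bits.

```cpp
// Outcome categories for the integer returned by spell(), following the
// description in this manual page. The names are illustrative only.
enum class SpellOutcome { NotRecognised, Accepted, AcceptedWithWarning };

// Map a spell() return value to an outcome. Per the text, the value 2 is
// reported only from version 1.3 on, for words carrying the "WARN" flag.
SpellOutcome classify_spell_result(int rv) {
    if (rv == 0) return SpellOutcome::NotRecognised;
    if (rv == 2) return SpellOutcome::AcceptedWithWarning;
    return SpellOutcome::Accepted;  // any other non-zero value
}
```

With the library, the argument would be the value returned by
spell(word) on an allocated Hunspell object.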

The suggest() function has two parameters: a reference variable for
the output suggestion list, and the input word. The function returns
the number of suggestions. The reference variable will contain the
address of the newly allocated suggestion list, or NULL if the return
value of suggest() is zero. The maximum number of suggestions is
limited in the source code.

The spell() and suggest() functions can recognize XML input; see the
XML API section.

Morphological functions
The plain stem() and analyze() functions are similar to suggest(), but
instead of suggestions they return stems and the results of the
morphological analysis. The plain generate() also expects a second
word. This extra word and its affixation serve as the model for the
morphological generation of the requested forms of the first word.

The extended stem() and generate() use the results of a morphological
analysis:

    char **result, **result2;
    int n1 = analyze(&result, "words");
    int n2 = stem(&result2, result, n1);

The morphological annotation of the Hunspell library has fixed field
identifiers (two letters and a colon); see the hunspell(4) manual page.

    char **result;
    char affix[] = "is:plural"; // the description also depends on the dictionary
    char *desc[] = { affix };
    int n = generate(&result, "word", desc, 1);
    for (int i = 0; i < n; i++) printf("%s\n", result[i]);
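
Because every field starts with a two-letter identifier and a colon, an
analysis line can be split into identifier/value pairs with plain
string handling. The following is a minimal sketch, assuming the fields
are separated by whitespace; parse_morph_fields is a hypothetical
helper, and the sample input in the usage note is invented, since real
analysis strings depend on the dictionary.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Split one analysis line into (field identifier, value) pairs.
// A token counts as a field when its third character is a colon, that is,
// when it starts with a two-letter identifier such as "st:" or "is:".
// Tokens that do not match this shape are skipped.
std::vector<std::pair<std::string, std::string>>
parse_morph_fields(const std::string& line) {
    std::vector<std::pair<std::string, std::string>> fields;
    std::istringstream in(line);
    std::string token;
    while (in >> token) {
        if (token.size() >= 3 && token[2] == ':') {
            fields.emplace_back(token.substr(0, 2), token.substr(3));
        }
    }
    return fields;
}
```

For an invented line such as " st:word po:noun is:plural", the helper
yields three pairs with the identifiers "st", "po" and "is".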

Memory deallocation
The free_list() function frees the memory allocated by the suggest(),
analyze(), generate() and stem() functions.
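
One way to keep allocation and deallocation paired is to copy the
returned list into a standard container and call free_list() right
away. The sketch below is written as a template so that it compiles
without the library: StubChecker is a hypothetical stand-in that mimics
only the suggest() and free_list() signatures from the synopsis, and
collect_suggestions is likewise an illustrative name.

```cpp
#include <cstdlib>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for a Hunspell object, NOT the real library.
// It knows one misspelling, "wrod", and counts lists not yet freed.
struct StubChecker {
    int live_lists = 0;

    int suggest(char*** slst, const char* word) {
        if (std::strcmp(word, "wrod") != 0) {
            *slst = nullptr;  // no suggestions: NULL list, return value 0
            return 0;
        }
        const char* fixed[] = {"word", "rod"};
        *slst = static_cast<char**>(std::malloc(2 * sizeof(char*)));
        for (int i = 0; i < 2; ++i) {
            (*slst)[i] =
                static_cast<char*>(std::malloc(std::strlen(fixed[i]) + 1));
            std::strcpy((*slst)[i], fixed[i]);
        }
        ++live_lists;
        return 2;
    }

    void free_list(char*** slst, int n) {
        if (slst == nullptr || *slst == nullptr) return;
        for (int i = 0; i < n; ++i) std::free((*slst)[i]);
        std::free(*slst);
        *slst = nullptr;
        --live_lists;
    }
};

// Copy the suggestions into a std::vector, then release the list that the
// checker allocated. Checker is any type with the two members used here.
template <class Checker>
std::vector<std::string> collect_suggestions(Checker& checker,
                                             const char* word) {
    char** list = nullptr;
    const int n = checker.suggest(&list, word);
    std::vector<std::string> out;
    if (n > 0 && list != nullptr) {
        out.assign(list, list + n);
        checker.free_list(&list, n);
    }
    return out;
}
```

With the real library, a Hunspell object would take the place of
StubChecker. The same pattern should apply to analyze(), stem() and
generate(), since they return their results the same way.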

Other functions
The add(), add_with_affix() and remove() functions are helpers for a
personal dictionary implementation; they add words to and remove words
from the base dictionary at run time. The add_with_affix() function
uses a second word as a model of the affixation allowed for the new
word.

The get_dic_encoding() function returns "ISO8859-1" or the character
encoding defined in the affix file with the "SET" keyword.

The get_csconv() function returns the 8-bit character case table of the
encoding of the dictionary.

The get_wordchars() and get_wordchars_utf16() functions return the
extra word characters defined in the affix file for tokenization with
the "WORDCHARS" keyword.
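
To illustrate what extra word characters mean for tokenization, here is
a deliberately simplified sketch for 8-bit text. It treats a byte as
part of a word when it is alphabetic in the current C locale or occurs
in the extra character string. The tokenize function is hypothetical
and is not how the library or its command-line tool splits text.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Split text into words. extra_wordchars plays the role of the string
// that get_wordchars() would return for the loaded dictionary.
std::vector<std::string> tokenize(const std::string& text,
                                  const std::string& extra_wordchars) {
    std::vector<std::string> words;
    std::string current;
    for (unsigned char c : text) {
        const bool in_word =
            std::isalpha(c) != 0 ||
            extra_wordchars.find(static_cast<char>(c)) != std::string::npos;
        if (in_word) {
            current.push_back(static_cast<char>(c));
        } else if (!current.empty()) {
            words.push_back(current);  // a separator ends the current word
            current.clear();
        }
    }
    if (!current.empty()) words.push_back(current);
    return words;
}
```

With an apostrophe as the only extra character, "don't" stays one
token; with an empty extra string it is split at the apostrophe.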

The get_version() function returns the version string of the library.

XML API
The spell() function returns non-zero for the "<?xml?>" input,
indicating XML API support.

The suggest() function stems, analyzes and generates the forms of the
input word if it is passed in one of the following "SPELLML" syntaxes:

    <?xml?>
    <query type="analyze">
    <word>dogs</word>
    </query>

    <?xml?>
    <query type="stem">
    <word>dogs</word>
    </query>

    <?xml?>
    <query type="generate">
    <word>dog</word>
    <word>cats</word>
    </query>

    <?xml?>
    <query type="generate">
    <word>dog</word>
    <code><a>is:pl</a><a>is:poss</a></code>
    </query>
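
The word-based queries above share one shape, so the input string can
be assembled by a small function. This is a sketch only: spellml_query
is a hypothetical helper, it mirrors the line breaks of the examples
without indentation, and it assumes the words contain no markup
characters such as "<" or "&".

```cpp
#include <string>
#include <vector>

// Build a SPELLML query of the given type ("analyze", "stem" or
// "generate") from one or more words, one <word> element per line.
std::string spellml_query(const std::string& type,
                          const std::vector<std::string>& words) {
    std::string query = "<?xml?>\n<query type=\"" + type + "\">\n";
    for (const std::string& word : words) {
        query += "<word>" + word + "</word>\n";
    }
    query += "</query>";
    return query;
}
```

The resulting string would be passed, via c_str(), as the word argument
of suggest().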

The outputs of the type="stem" query and the stem() library function
are the same. The output of the type="analyze" query is a string
containing a <code><a>result1</a><a>result2</a>...</code> element. This
element can be used in the second syntax of the type="generate" query.
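
The individual results can be pulled out of such a <code> element with
simple substring searches. The following sketch assumes the items are
not nested and does not decode XML entities; extract_code_items is a
hypothetical helper, and the sample string in the usage note is
invented.

```cpp
#include <string>
#include <vector>

// Return the text of every <a>...</a> item found in the input string.
std::vector<std::string> extract_code_items(const std::string& xml) {
    std::vector<std::string> items;
    const std::string open = "<a>";
    const std::string close = "</a>";
    std::string::size_type pos = 0;
    while ((pos = xml.find(open, pos)) != std::string::npos) {
        const std::string::size_type start = pos + open.size();
        const std::string::size_type end = xml.find(close, start);
        if (end == std::string::npos) break;  // unterminated item: stop
        items.push_back(xml.substr(start, end - start));
        pos = end + close.size();
    }
    return items;
}
```

For an invented input such as "<code><a>st:dog is:pl</a><a>st:dog</a></code>"
the helper returns the two strings "st:dog is:pl" and "st:dog".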

EXAMPLE
See analyze.cxx in the Hunspell distribution.

AUTHORS
Hunspell is based on Ispell's spell checking algorithms and
OpenOffice.org's MySpell source code.

The author of International Ispell is Geoff Kuenning.

The author of MySpell is Kevin Hendricks.

The author of Hunspell is László Németh.

The author of the original C API is Caolan McNamara.

The author of the Aspell table-driven phonetic transcription algorithm
and code is Björn Jacke.

See also the THANKS and Changelog files of the Hunspell distribution.

2011-02-01                                                    hunspell(3)