WNSEARCH(3)               WordNet™ Library Functions               WNSEARCH(3)

NAME

       findtheinfo, findtheinfo_ds, is_defined, in_wn, index_lookup,
       parse_index, getindex, read_synset, parse_synset, free_syns,
       free_synset, free_index, traceptrs_ds, do_trace - functions for
       searching the WordNet database

SYNOPSIS

       #include "wn.h"

       char *findtheinfo(char *searchstr, int pos, int ptr_type,
                         int sense_num);

       SynsetPtr findtheinfo_ds(char *searchstr, int pos, int ptr_type,
                                int sense_num);

       unsigned int is_defined(char *searchstr, int pos);

       unsigned int in_wn(char *searchstr, int pos);

       IndexPtr index_lookup(char *searchstr, int pos);

       IndexPtr parse_index(long offset, int dbase, char *line);

       IndexPtr getindex(char *searchstr, int pos);

       SynsetPtr read_synset(int pos, long synset_offset, char *searchstr);

       SynsetPtr parse_synset(FILE *fp, int pos, char *searchstr);

       void free_syns(SynsetPtr synptr);

       void free_synset(SynsetPtr synptr);

       void free_index(IndexPtr idx);

       SynsetPtr traceptrs_ds(SynsetPtr synptr, int ptr_type, int pos,
                              int depth);

       char *do_trace(SynsetPtr synptr, int ptr_type, int pos, int depth);

DESCRIPTION

       These functions are used for searching the WordNet database.  They
       generally fall into several categories: functions for reading and
       parsing index file entries; functions for reading and parsing synsets
       in data files; functions for tracing pointers and hierarchies; and
       functions for freeing space occupied by data structures allocated with
       malloc(3).

       In the following function descriptions, pos is one of the following:

              1    NOUN
              2    VERB
              3    ADJECTIVE
              4    ADVERB

       findtheinfo() is the primary search algorithm for use with database
       interface applications.  Search results are automatically formatted,
       and a pointer to the text buffer is returned.  All searches listed in
       WNHOME/include/wn.h can be done by findtheinfo().  findtheinfo_ds()
       can be used to perform most of the searches, with results returned in
       a linked list data structure.  This is for use with applications that
       need to analyze the search results rather than just display them.

       Both functions are passed the same arguments: searchstr is the word or
       collocation to search for; pos indicates the syntactic category to
       search in; ptr_type is one of the valid search types for searchstr in
       pos.  (Available searches can be obtained by calling is_defined(),
       described below.)  sense_num should be ALLSENSES if the search is to
       be done on all senses of searchstr in pos, or a positive integer
       indicating which sense to search.

       findtheinfo_ds() returns a linked list of data structures representing
       synsets.  Senses are linked through the nextss field of a Synset data
       structure.  For each sense, synsets that match the search specified
       with ptr_type are linked through the ptrlist field.  See Synset
       Navigation, below, for detailed information on the linked lists
       returned.

       is_defined() sets a bit for each search type that is valid for
       searchstr in pos, and returns the resulting unsigned integer.  Each
       bit number corresponds to a pointer type constant defined in
       WNHOME/include/wn.h.  For example, if bit 2 is set, the HYPERPTR
       search is valid for searchstr.  There are 29 possible searches.
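
       The bit test described for is_defined() can be sketched as follows.
       This is a minimal illustration, not library code: the EX_ constants
       and search_is_valid() are hypothetical names, the values are copied
       from the ptr_type table in this page, and mask stands in for the
       value is_defined() would return.

```c
#include <assert.h>

/* Values mirrored from the ptr_type table in this page; a real program
   would use the constants from wn.h instead of redefining them. */
enum { EX_ANTPTR = 1, EX_HYPERPTR = 2, EX_HYPOPTR = 3 };

/* Return 1 if the bit numbered ptr_type is set in mask, else 0.
   mask is assumed to be a value returned by is_defined(). */
static int search_is_valid(unsigned int mask, int ptr_type)
{
    /* Only bit numbers that fit in a 32-bit mask can be tested. */
    assert(ptr_type >= 0 && ptr_type < 32);
    return (int)((mask >> ptr_type) & 1u);
}
```

       A caller would typically test the mask before passing the same
       ptr_type to findtheinfo() or findtheinfo_ds().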

       in_wn() is used to find the syntactic categories in the WordNet
       database that contain one or more senses of searchstr.  If pos is
       ALL_POS, all syntactic categories are checked.  Otherwise, only the
       part of speech passed is checked.  An unsigned integer is returned
       with a bit set corresponding to each syntactic category containing
       searchstr.  The bit number matches the number for the part of speech.
       0 is returned if searchstr is not present in pos.
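
       Decoding the value returned by in_wn() might look like the sketch
       below.  The helper names are hypothetical, the part-of-speech
       numbers are the ones listed earlier in this page (1 through 4), and
       mask stands in for the return value of in_wn().

```c
#include <assert.h>

/* Map a part-of-speech number (1..4, as listed in this page) to a name. */
static const char *pos_name(int pos)
{
    static const char *const names[] = { "noun", "verb", "adjective", "adverb" };
    assert(pos >= 1 && pos <= 4);
    return names[pos - 1];
}

/* Return 1 if the bit for pos is set in mask, else 0. */
static int mask_has_pos(unsigned int mask, int pos)
{
    assert(pos >= 1 && pos <= 4);
    return (int)((mask >> pos) & 1u);
}

/* Count how many syntactic categories are flagged in mask. */
static int count_pos(unsigned int mask)
{
    int pos, n = 0;
    for (pos = 1; pos <= 4; pos++)
        n += mask_has_pos(mask, pos);
    return n;
}
```

       Because bit 0 is never used for a part of speech, a mask of 0 means
       that searchstr was not found in any category checked.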

       index_lookup() finds searchstr in the index file for pos and returns a
       pointer to the parsed entry in an Index data structure.  searchstr
       must exactly match the form of the word (lower case only, hyphens and
       underscores in the same places) in the index file.  NULL is returned
       if a match is not found.

       parse_index() parses an entry from an index file and returns a pointer
       to the parsed entry in an Index data structure.  Passed the byte
       offset and syntactic category, it reads the index entry at the desired
       location in the corresponding file.  If line is passed, it contains an
       index file entry and the database index file is not consulted.
       However, offset and dbase should still be passed so the information
       can be stored in the Index structure.

       getindex() is a "smart" search for searchstr in the index file
       corresponding to pos.  It applies to searchstr an algorithm that
       replaces underscores with hyphens, hyphens with underscores, removes
       hyphens and underscores, and removes periods in an attempt to find a
       form of the string that is an exact match for an entry in the index
       file corresponding to pos.  index_lookup() is called on each
       transformed string until a match is found or all the different strings
       have been tried.  It returns a pointer to the parsed Index data
       structure for searchstr, or NULL if a match is not found.
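
       The string transformations named for getindex() can be illustrated
       with two small helpers.  This is only a sketch of the four
       transformations; the order in which the library tries them, and
       whether it combines them, is not stated here, and replace_char() and
       remove_chars() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy src to dst, replacing every occurrence of from with to.
   Output is truncated to fit dstsize and always NUL-terminated. */
static void replace_char(const char *src, char *dst, size_t dstsize,
                         char from, char to)
{
    size_t i;
    assert(dstsize > 0);
    for (i = 0; src[i] != '\0' && i + 1 < dstsize; i++)
        dst[i] = (src[i] == from) ? to : src[i];
    dst[i] = '\0';
}

/* Copy src to dst, dropping every character that appears in drop. */
static void remove_chars(const char *src, char *dst, size_t dstsize,
                         const char *drop)
{
    size_t i, j = 0;
    assert(dstsize > 0);
    for (i = 0; src[i] != '\0' && j + 1 < dstsize; i++)
        if (strchr(drop, src[i]) == NULL)
            dst[j++] = src[i];
    dst[j] = '\0';
}
```

       Each candidate string produced this way would then be passed to
       index_lookup() until one of them matches.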

       read_synset() is used to read a synset from a byte offset in a data
       file.  It performs an fseek(3) to synset_offset in the data file
       corresponding to pos, and calls parse_synset() to read and parse the
       synset.  A pointer to the Synset data structure containing the parsed
       synset is returned.
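
       The seek-then-read pattern described for read_synset() is shown
       below using only standard I/O.  read_line_at() is a hypothetical
       helper; it fetches the raw line at a byte offset and does none of
       the parsing that parse_synset() performs.

```c
#include <assert.h>
#include <stdio.h>

/* Seek to offset in fp and read one line into buf.
   Returns buf on success, or NULL if the seek or the read fails. */
static char *read_line_at(FILE *fp, long offset, char *buf, int bufsize)
{
    assert(fp != NULL && buf != NULL && bufsize > 0);
    if (fseek(fp, offset, SEEK_SET) != 0)
        return NULL;
    return fgets(buf, bufsize, fp);
}
```

       With a WordNet data file opened by the caller, offset would be a
       synset_offset taken from an index entry.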

       parse_synset() reads the synset at the current offset in the file
       indicated by fp.  pos is the syntactic category, and searchstr, if not
       NULL, indicates the word in the synset that the caller is interested
       in.  An attempt is made to match searchstr to one of the words in the
       synset.  If an exact match is found, the whichword field in the Synset
       structure is set to that word's number in the synset (beginning to
       count from 1).
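
       The exact-match rule behind whichword can be sketched as a plain
       string comparison.  which_word() is a hypothetical helper, and
       returning 0 when nothing matches is a convention of this sketch
       rather than documented behavior.

```c
#include <assert.h>
#include <string.h>

/* Return the 1-based position of searchstr among words[0..nwords-1],
   using an exact strcmp() match, or 0 if it is absent or NULL. */
static int which_word(const char *const words[], int nwords,
                      const char *searchstr)
{
    int i;
    assert(words != NULL && nwords >= 0);
    if (searchstr == NULL)
        return 0;
    for (i = 0; i < nwords; i++)
        if (strcmp(words[i], searchstr) == 0)
            return i + 1;
    return 0;
}
```

       Note that "domestic-dog" does not match "domestic_dog" under this
       rule, since hyphens and underscores are compared literally.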

       free_syns() is used to free a linked list of Synset structures
       allocated by findtheinfo_ds().  synptr is a pointer to the list to
       free.

       free_synset() frees the Synset structure pointed to by synptr.

       free_index() frees the Index structure pointed to by idx.
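
       Freeing a list in which each node has both a nextss and a ptrlist
       link can be sketched as below.  struct ex_node is a hypothetical
       stand-in that models only those two link fields; it is not the
       Synset structure, and ex_free_all() is not the implementation of
       free_syns().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in: only the two link fields are modeled. */
struct ex_node {
    struct ex_node *nextss;
    struct ex_node *ptrlist;
};

/* Allocate a node with the given links. */
static struct ex_node *ex_new(struct ex_node *nextss, struct ex_node *ptrlist)
{
    struct ex_node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->nextss = nextss;
    n->ptrlist = ptrlist;
    return n;
}

/* Free every node reachable from head through nextss and ptrlist.
   Returns the number of nodes freed. */
static int ex_free_all(struct ex_node *head)
{
    int freed = 0;
    while (head != NULL) {
        struct ex_node *next = head->nextss;  /* save before freeing */
        freed += ex_free_all(head->ptrlist);
        free(head);
        freed++;
        head = next;
    }
    return freed;
}
```

       The next pointer is saved before the node is released so that the
       walk along the nextss chain never reads freed memory.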

       traceptrs_ds() is a recursive search algorithm that traces pointers
       matching ptr_type starting with the synset pointed to by synptr.
       Setting depth to 1 when traceptrs_ds() is called indicates a recursive
       search; 0 indicates a non-recursive call.  synptr points to the data
       structure representing the synset to search for a pointer of type
       ptr_type.  When a pointer type match is found, the synset pointed to
       is read and linked onto the nextss chain.  Levels of the tree
       generated by a recursive search are linked via the ptrlist field until
       NULL is found, indicating the top (or bottom) of the tree.  This
       function is usually called from findtheinfo_ds() for each sense of the
       word.  See Synset Navigation, below, for detailed information on the
       linked lists returned.

       do_trace() performs the search indicated by ptr_type on synset synptr
       in syntactic category pos.  depth is defined as above.  do_trace()
       returns the search results formatted in a text buffer.

   Synset Navigation
       Since the Synset structure is used to represent the synsets for both
       word senses and pointers, the ptrlist and nextss fields have different
       meanings depending on whether the structure is a word sense or
       pointer.  This can make navigation through the lists returned by
       findtheinfo_ds() confusing.

       Navigation through the returned list involves the following:

       Following the nextss chain from the synset returned moves through the
       various senses of searchstr.  NULL indicates the end of the chain of
       senses.

       Following the ptrlist chain from a Synset structure representing a
       sense traces the hierarchy of the search results for that sense.
       Subsequent links in the ptrlist chain indicate the next level (up or
       down, depending on the search) in the hierarchy.  NULL indicates the
       end of the chain of search result synsets.

       If a synset pointed to by ptrlist has a value in the nextss field, it
       represents another pointer of the same type at that level in the
       hierarchy.  For example, some noun synsets have two hypernyms.
       Following this nextss pointer, and then the ptrlist chain from the
       Synset structure pointed to, traces another, parallel, hierarchy,
       until the end is indicated by NULL on that ptrlist chain.  So, a
       synset representing a pointer (versus a sense of searchstr) having a
       non-NULL value in nextss has another chain of search results linked
       through the ptrlist chain of the synset pointed to by nextss.

       If searchstr contains more than one base form in WordNet (as in the
       noun axes, which has base forms axe and axis), synsets representing
       the search results for each base form are linked through the nextform
       pointer of the Synset structure.
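
       The navigation rules above can be sketched with a stand-in node
       type.  struct nav_node is hypothetical and models only the nextss
       and ptrlist links; the nextform link is left out, and the function
       names are illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two link fields discussed above. */
struct nav_node {
    struct nav_node *nextss;
    struct nav_node *ptrlist;
};

/* Count senses: nodes on the nextss chain starting at the list head. */
static int count_senses(const struct nav_node *head)
{
    int n = 0;
    for (; head != NULL; head = head->nextss)
        n++;
    return n;
}

/* Count hierarchy levels under one sense, following only the first
   pointer at each level of the ptrlist chain. */
static int count_levels(const struct nav_node *sense)
{
    const struct nav_node *p;
    int n = 0;
    assert(sense != NULL);
    for (p = sense->ptrlist; p != NULL; p = p->ptrlist)
        n++;
    return n;
}

/* Count every result synset starting at p, including parallel
   hierarchies reached through nextss on a pointer synset. */
static int count_results(const struct nav_node *p)
{
    int n = 0;
    for (; p != NULL; p = p->nextss)
        n += 1 + count_results(p->ptrlist);
    return n;
}
```

       count_senses() is applied to the head of the returned list, while
       count_results() is applied to the ptrlist of a single sense, which
       reflects the two different meanings of nextss.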

   WordNet Searches
       There is no extensive description of what each search type is or the
       results returned.  Using the WordNet interface, examining the source
       code, and reading wndb(5) are the best ways to see what types of
       searches are available and the data returned for each.

       Listed below are the valid searches that can be passed as ptr_type to
       findtheinfo().  Passing a negative value (when applicable) causes a
       recursive, hierarchical search by setting depth to 1 when traceptrs()
       is called.
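
       One way to picture the negative-value convention is to split a
       signed ptr_type into a search type and a depth flag.  This sketch
       assumes that the sign carries the recursion flag and the magnitude
       carries the search type; struct ex_search and split_ptr_type() are
       hypothetical names, and the library's own handling is not shown
       here.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical pair: the search type and the depth flag derived from it. */
struct ex_search {
    int ptr_type;   /* positive search type, as in the table */
    int depth;      /* 1 = recursive search, 0 = non-recursive */
};

/* Split a signed ptr_type into its magnitude and a depth flag. */
static struct ex_search split_ptr_type(int ptr_type)
{
    struct ex_search s;
    assert(ptr_type != 0);
    s.depth = (ptr_type < 0) ? 1 : 0;
    s.ptr_type = abs(ptr_type);
    return s;
}
```

       Under this assumption, passing -2 would request a recursive search
       of type 2 (HYPERPTR), and passing 2 a single-level one.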

  ┌─────────────────┬───────┬─────────┬────────────────────────────────────────────┐
  │ptr_type         │ Value │ Pointer │ Search                                     │
  │                 │       │ Symbol  │                                            │
  ├─────────────────┼───────┼─────────┼────────────────────────────────────────────┤
  │ANTPTR           │   1   │    !    │ Antonyms                                   │
  │HYPERPTR         │   2   │    @    │ Hypernyms                                  │
  │HYPOPTR          │   3   │    ~    │ Hyponyms                                   │
  │ENTAILPTR        │   4   │    *    │ Entailment                                 │
  │SIMPTR           │   5   │    &    │ Similar                                    │
  │ISMEMBERPTR      │   6   │   #m    │ Member meronym                             │
  │ISSTUFFPTR       │   7   │   #s    │ Substance meronym                          │
  │ISPARTPTR        │   8   │   #p    │ Part meronym                               │
  │HASMEMBERPTR     │   9   │   %m    │ Member holonym                             │
  │HASSTUFFPTR      │  10   │   %s    │ Substance holonym                          │
  │HASPARTPTR       │  11   │   %p    │ Part holonym                               │
  │MERONYM          │  12   │    %    │ All meronyms                               │
  │HOLONYM          │  13   │    #    │ All holonyms                               │
  │CAUSETO          │  14   │    >    │ Cause                                      │
  │PPLPTR           │  15   │    <    │ Participle of verb                         │
  │SEEALSOPTR       │  16   │    ^    │ Also see                                   │
  │PERTPTR          │  17   │    \    │ Pertains to noun or derived from adjective │
  │ATTRIBUTE        │  18   │    =    │ Attribute                                  │
  │VERBGROUP        │  19   │    $    │ Verb group                                 │
  │DERIVATION       │  20   │    +    │ Derivationally related form                │
  │CLASSIFICATION   │  21   │    ;    │ Domain of synset                           │
  │CLASS            │  22   │    -    │ Member of this domain                      │
  │SYNS             │  23   │   n/a   │ Find synonyms                              │
  │FREQ             │  24   │   n/a   │ Polysemy                                   │
  │FRAMES           │  25   │   n/a   │ Verb example sentences and generic frames  │
  │COORDS           │  26   │   n/a   │ Noun coordinates                           │
  │RELATIVES        │  27   │   n/a   │ Group related senses                       │
  │HMERONYM         │  28   │   n/a   │ Hierarchical meronym search                │
  │HHOLONYM         │  29   │   n/a   │ Hierarchical holonym search                │
  │WNGREP           │  30   │   n/a   │ Find keywords by substring                 │
  │OVERVIEW         │  31   │   n/a   │ Show all synsets for word                  │
  │CLASSIF_CATEGORY │  32   │   ;c    │ Show domain topic                          │
  │CLASSIF_USAGE    │  33   │   ;u    │ Show domain usage                          │
  │CLASSIF_REGIONAL │  34   │   ;r    │ Show domain region                         │
  │CLASS_CATEGORY   │  35   │   -c    │ Show domain terms for topic                │
  │CLASS_USAGE      │  36   │   -u    │ Show domain terms for usage                │
  │CLASS_REGIONAL   │  37   │   -r    │ Show domain terms for region               │
  │INSTANCE         │  38   │   @i    │ Instance of                                │
  │INSTANCES        │  39   │   ~i    │ Show instances                             │
  └─────────────────┴───────┴─────────┴────────────────────────────────────────────┘

       findtheinfo_ds() cannot perform the following searches:

              SEEALSOPTR
              PERTPTR
              VERBGROUP
              FREQ
              FRAMES
              RELATIVES
              WNGREP
              OVERVIEW
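
       A caller that wants to choose between findtheinfo() and
       findtheinfo_ds() could check the list above before searching.  The
       sketch below is illustrative: ds_search_supported() is a
       hypothetical name, and the numeric values are copied from the table
       in this page rather than taken from wn.h.

```c
#include <assert.h>
#include <stddef.h>

/* Search types this page lists as unavailable through findtheinfo_ds():
   SEEALSOPTR, PERTPTR, VERBGROUP, FREQ, FRAMES, RELATIVES, WNGREP,
   OVERVIEW.  Values are copied from the table in this page. */
static const int ex_unsupported[] = { 16, 17, 19, 24, 25, 27, 30, 31 };

/* Return 1 if ptr_type is not on the unsupported list, else 0. */
static int ds_search_supported(int ptr_type)
{
    size_t i;
    assert(ptr_type > 0);
    for (i = 0; i < sizeof ex_unsupported / sizeof ex_unsupported[0]; i++)
        if (ex_unsupported[i] == ptr_type)
            return 0;
    return 1;
}
```

       A result of 1 only means the type is absent from this list; whether
       the search is valid for a given word is still decided by
       is_defined().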

NOTES

       Applications that use WordNet and/or the morphological functions must
       call wninit() at the start of the program.  See wnutil(3) for more
       information.

       In all function calls, searchstr may be either a word or a collocation
       formed by joining individual words with underscore characters (_).

       The SearchResults structure defines fields in the wnresults global
       variable that are set by the various search functions.  This is a way
       to get additional information, such as the number of senses the word
       has, from the search functions.  The searchds field is set by
       findtheinfo_ds().

       The pos passed to traceptrs_ds() is not used.

SEE ALSO

       wn(1), wnb(1), wnintro(3), binsrch(3), malloc(3), morph(3), wnutil(3),
       wnintro(5).

WARNINGS

       parse_synset() must find an exact match between the searchstr passed
       and a word in the synset to set whichword.  No attempt is made to
       translate hyphens and underscores, as is done in getindex().

       The WordNet database and exception list files must be opened with
       wninit() prior to using any of the searching functions.

       A large search may cause findtheinfo() to run out of buffer space.
       The maximum buffer size is determined by computer platform.  If the
       buffer size is exceeded, the following message is printed in the
       output buffer: "Search too large.  Narrow search and try again...".
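
       A caller can look for the overflow message in the text returned by
       findtheinfo().  The sketch below matches only the leading words of
       the message, because the exact spacing and placement of the full
       text in the buffer are not specified here.  search_overflowed() is a
       hypothetical helper.

```c
#include <assert.h>
#include <string.h>

/* Return 1 if result contains the start of the overflow message quoted
   in this page, else 0.  Only the prefix is matched. */
static int search_overflowed(const char *result)
{
    assert(result != NULL);
    return strstr(result, "Search too large.") != NULL;
}
```

       When this returns 1, narrowing the search (for example, to a single
       sense_num) and retrying is the remedy the message itself suggests.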

       Passing an invalid pos will probably result in a core dump.

WordNet 3.0                        Dec 2006                        WNSEARCH(3)