MORPHY(7)                          WordNet™                          MORPHY(7)

NAME

       morphy - discussion of WordNet's morphological processing

DESCRIPTION

       Although only base forms of words are usually stored in WordNet,
       searches may be done on inflected forms.  A set of morphology
       functions, Morphy, is applied to the search string to generate a form
       that is present in WordNet.

       Morphology in WordNet uses two types of processes to try to convert the
       string passed into one that can be found in the WordNet database.
       First, there are lists of inflectional endings, based on syntactic
       category, that can be detached from individual words in an attempt to
       find a form of the word that is in WordNet.  Second, there are
       exception list files, one for each syntactic category, which are
       searched for the inflected form.  Morphy combines the two processes to
       translate the string passed to a base form found in WordNet: it first
       checks for exceptions, then uses the rules of detachment.  The Morphy
       functions are not independent of WordNet.  After each transformation,
       WordNet is searched for the resulting string in the syntactic category
       specified.

       The Morphy functions are passed a string and a syntactic category.  A
       string is either a single word or a collocation.  Since some words,
       such as axes, can have more than one base form (axe and axis), Morphy
       works in the following manner.  The first time that Morphy is called
       with a specific string, it returns a base form.  For each subsequent
       call to Morphy made with a NULL string argument, Morphy returns another
       base form.  Whenever Morphy cannot perform a transformation, whether on
       the first call for a word or on subsequent calls, NULL is returned.  A
       transformation to a valid English string will return NULL if the base
       form of the string is not in WordNet.
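
       The calling convention above can be sketched in Python.  This is a toy
       stand-in, not the WordNet library interface (see morph(3) for that):
       the class name BaseFormIterator and its lookup table are hypothetical,
       and None plays the role of NULL.

```python
class BaseFormIterator:
    """Toy model of the "string first, then NULL" calling convention."""

    def __init__(self, table):
        # table maps (string, category) -> list of base forms (hypothetical data).
        self._table = table
        self._pending = []

    def __call__(self, string, category):
        if string is not None:
            # A new string restarts the sequence of base forms.
            self._pending = list(self._table.get((string, category), []))
        # None stands in for NULL: there are no (more) base forms.
        return self._pending.pop(0) if self._pending else None


morphy = BaseFormIterator({("axes", "noun"): ["axe", "axis"]})

forms = []
form = morphy("axes", "noun")      # first call passes the string
while form is not None:
    forms.append(form)
    form = morphy(None, "noun")    # later calls pass None (NULL)
```

       After the loop, forms holds both base forms of axes; a string that is
       missing from the table yields None on the very first call.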

       The morphological functions are found in the WordNet library.  See
       morph(3) for information on using these functions.

   Rules of Detachment
       The following table shows the rules of detachment used by Morphy.  If a
       word ends with one of the suffixes, the suffix is stripped from the
       word and the corresponding ending is added.  Then WordNet is searched
       for the resulting string.  No rules are applicable to adverbs.

                              POS  │ Suffix │ Ending
                              ─────┼────────┼────────
                              NOUN │ "s"    │ ""
                              NOUN │ "ses"  │ "s"
                              NOUN │ "xes"  │ "x"
                              NOUN │ "zes"  │ "z"
                              NOUN │ "ches" │ "ch"
                              NOUN │ "shes" │ "sh"
                              NOUN │ "men"  │ "man"
                              NOUN │ "ies"  │ "y"
                              VERB │ "s"    │ ""
                              VERB │ "ies"  │ "y"
                              VERB │ "es"   │ "e"
                              VERB │ "es"   │ ""
                              VERB │ "ed"   │ "e"
                              VERB │ "ed"   │ ""
                              VERB │ "ing"  │ "e"
                              VERB │ "ing"  │ ""
                              ADJ  │ "er"   │ ""
                              ADJ  │ "est"  │ ""
                              ADJ  │ "er"   │ "e"
                              ADJ  │ "est"  │ "e"
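
       As an illustration, the table can be applied as follows.  This is a
       minimal Python sketch, not the library code: the function name detach
       and the toy lexicon are hypothetical, the lexicon ignores part of
       speech, and trying the rules in table order and stopping at the first
       hit is an assumption.

```python
# (suffix, ending) pairs copied from the table above, keyed by part of speech.
RULES = {
    "noun": [("s", ""), ("ses", "s"), ("xes", "x"), ("zes", "z"),
             ("ches", "ch"), ("shes", "sh"), ("men", "man"), ("ies", "y")],
    "verb": [("s", ""), ("ies", "y"), ("es", "e"), ("es", ""),
             ("ed", "e"), ("ed", ""), ("ing", "e"), ("ing", "")],
    "adj": [("er", ""), ("est", ""), ("er", "e"), ("est", "e")],
    "adv": [],  # no rules are applicable to adverbs
}


def detach(word, pos, lexicon):
    """Return the first rule-generated candidate found in `lexicon`, else None."""
    for suffix, ending in RULES[pos]:
        if word.endswith(suffix):
            # Strip the suffix, then add the corresponding ending.
            candidate = word[: len(word) - len(suffix)] + ending
            if candidate in lexicon:
                return candidate
    return None


# Toy stand-in for the WordNet database (assumption: a plain set of words).
TOY_LEXICON = {"church", "box", "fly", "make", "walk", "large"}
```

       With this lexicon, detach("churches", "noun", TOY_LEXICON) rejects the
       candidate churche produced by the "s" rule and accepts church from the
       "ches" rule.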

   Exception Lists
       There is one exception list file for each syntactic category.  The
       exception lists contain the morphological transformations for strings
       that are not regular and therefore cannot be processed in an
       algorithmic manner.  Each line of an exception list contains an
       inflected form of a word or collocation, followed by one or more base
       forms.  Each list is kept in alphabetical order and a binary search is
       used to find words in these lists.  See wndb(5) for information on the
       format of the exception list files.
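
       The lookup described above can be approximated in memory.  The sketch
       below is in Python and uses invented sample lines; it assumes
       whitespace-separated fields and is not a substitute for the file
       format in wndb(5) or the on-disk search in binsrch(3).

```python
from bisect import bisect_left

# Hypothetical sample lines: inflected form first, then one or more base forms.
# The lines are already in alphabetical order, as the lists are said to be.
SAMPLE_LINES = [
    "axes axe axis",
    "children child",
    "geese goose",
    "mice mouse",
]


def load(lines):
    """Return parallel lists: inflected forms and their lists of base forms."""
    keys, bases = [], []
    for line in lines:
        inflected, *rest = line.split()
        keys.append(inflected)
        bases.append(rest)
    return keys, bases


def lookup(keys, bases, word):
    """Binary-search `keys` for `word`; return its base forms, or [] if absent."""
    i = bisect_left(keys, word)
    if i < len(keys) and keys[i] == word:
        return bases[i]
    return []


KEYS, BASES = load(SAMPLE_LINES)
```

       Because the keys are sorted, each lookup needs only a logarithmic
       number of comparisons.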

   Single Words
       In general, single words are relatively easy to process.  Morphy first
       looks for the word in the exception list for the syntactic category.
       If it is found, the first base form is returned.  Subsequent calls with
       a NULL argument return additional base forms, if present.  NULL is
       returned when there are no more base forms of the word.

       If the word is not found in the exception list corresponding to the
       syntactic category, an algorithmic process using the rules of
       detachment looks for a matching suffix.  If a matching suffix is found,
       the corresponding ending is applied (sometimes this ending is an empty
       string, so in effect the suffix is simply removed from the word), and
       WordNet is consulted to see if the resulting word is found in the
       desired part of speech.
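
       The order of the two steps can be sketched as follows.  This Python
       fragment is illustrative only: base_forms, the exception entries and
       the lexicon are hypothetical, and only a subset of the noun rules is
       included.

```python
# Subset of the noun rules of detachment, enough for this demonstration.
NOUN_RULES = [("s", ""), ("ses", "s"), ("xes", "x"), ("ies", "y")]


def base_forms(word, exceptions, lexicon, rules=NOUN_RULES):
    """Exception list first; rules of detachment only if the word is not listed."""
    if word in exceptions:
        return list(exceptions[word])
    for suffix, ending in rules:
        if word.endswith(suffix):
            candidate = word[: len(word) - len(suffix)] + ending
            # Keep the candidate only if the (toy) database contains it.
            if candidate in lexicon:
                return [candidate]
    return []


EXCEPTIONS = {"axes": ["axe", "axis"], "geese": ["goose"]}  # hypothetical entries
LEXICON = {"axe", "axis", "goose", "dog", "box", "pony"}    # toy stand-in for WordNet
```

       Here axes is resolved by the exception list alone, while ponies falls
       through to the rules and is resolved by the "ies" rule.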

   Collocations
       As opposed to single words, collocations can be quite difficult to
       transform into a base form that is present in WordNet.  In general,
       only base forms of words, even those comprising collocations, are
       stored in WordNet, such as attorney general.  Transforming the
       collocation attorneys general is then simply a matter of finding the
       base forms of the individual words comprising the collocation.  This
       usually works for nouns; non-conforming nouns, such as customs duty,
       are therefore entered in the noun exception list.

       Verb collocations that contain prepositions, such as ask for it, are
       more difficult.  As with single words, the exception list is searched
       first.  If the collocation is not found, special code in Morphy
       determines whether a verb collocation includes a preposition.  If it
       does, a function is called to try to find the base form in the
       following manner.  It is assumed that the first word in the collocation
       is a verb and that the last word is a noun.  The algorithm then builds
       a search string with the base forms of the verb and noun, leaving the
       remainder of the collocation (usually just the preposition, but more
       words may be involved) in the middle.  For example, if Morphy is passed
       asking for it, the database search is performed with ask for it, which
       is found in WordNet and is therefore returned from Morphy.  If a verb
       collocation does not contain a preposition, then the base form of each
       word in the collocation is found and WordNet is searched for the
       resulting string.
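
       The preposition case can be sketched in Python as shown below.  The
       preposition list, the function name and the two base-form callbacks are
       assumptions made for the example; the real code may detect
       prepositions differently.

```python
PREPOSITIONS = {"for", "to", "at", "in", "on", "of", "up", "with"}  # assumed list


def verb_collocation_candidate(collocation, verb_base, noun_base):
    """Build the search string for a verb collocation containing a preposition.

    verb_base and noun_base are caller-supplied (hypothetical) functions that
    map a word to its base form, or return None when they have none.
    """
    words = collocation.split()
    middle = words[1:-1]
    if len(words) < 3 or not any(w in PREPOSITIONS for w in middle):
        return None  # not the preposition case covered by this sketch
    first = verb_base(words[0]) or words[0]   # first word treated as a verb
    last = noun_base(words[-1]) or words[-1]  # last word treated as a noun
    return " ".join([first, *middle, last])   # middle words are left untouched


candidate = verb_collocation_candidate(
    "asking for it", {"asking": "ask"}.get, {}.get
)
```

       The resulting candidate would still have to be looked up in WordNet
       before it could be returned as a base form.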

   Hyphenation
       Hyphenation also presents special difficulties when searching WordNet.
       It is often a subjective decision as to whether a word is hyphenated,
       joined as one word, or is a collocation of several words, and which of
       the various forms are entered into WordNet.  When Morphy breaks a
       string into "words", it looks for both spaces and hyphens as
       delimiters.  It also looks for periods in strings and removes them if
       an exact match is not found.  A search for an abbreviation like oct.
       returns the synset for { October, Oct }.  Not every pattern of
       hyphenated and collocated string is searched for properly, so it may
       be advantageous to specify several search strings if the results of a
       search attempt seem incomplete.
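
       Both behaviours can be sketched in a few lines of Python.  The function
       names and the set used as a lexicon are hypothetical; only the
       delimiters (spaces and hyphens) and the period fallback come from the
       description above.

```python
import re


def split_words(string):
    """Break a string into words, treating spaces and hyphens as delimiters."""
    return [w for w in re.split(r"[ -]+", string) if w]


def find_key(string, lexicon):
    """Try the exact string first; otherwise retry with periods removed."""
    if string in lexicon:
        return string
    stripped = string.replace(".", "")
    return stripped if stripped in lexicon else None
```

       For instance, split_words("mother-in-law") yields three words, and
       find_key("oct.", {"oct"}) falls back to oct once the exact match fails.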

   Special Processing for nouns ending with 'ful'
       Morphy contains code that searches for nouns ending with ful and
       performs a transformation on the substring preceding it.  It then
       appends 'ful' back onto the resulting string and returns it.  For
       example, if passed the noun boxesful, it will return boxful.
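
       A minimal Python sketch of this step follows.  The function name
       strip_ful and the noun_base callback are hypothetical; the callback
       stands in for whatever transformation Morphy applies to the substring.

```python
def strip_ful(word, noun_base):
    """Transform the part before a trailing 'ful', then append 'ful' again."""
    if not word.endswith("ful") or len(word) <= 3:
        return None
    stem = word[:-3]           # substring preceding 'ful'
    base = noun_base(stem)     # hypothetical base-form lookup for the stem
    return base + "ful" if base else None


result = strip_ful("boxesful", {"boxes": "box"}.get)
```

       With the toy lookup shown, result is boxful; a stem without a known
       base form produces None.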

BUGS

       Since many noun collocations contain prepositions, such as
       line of products, an algorithm similar to that used for verbs should be
       written for nouns.  In the present scheme, if Morphy is passed
       lines of products, the search string becomes line of product, which is
       not in WordNet.

       Morphy will allow non-words to be converted to words, if they follow
       one of the rules described above.  For example, it will happily convert
       plantes to plants.

ENVIRONMENT VARIABLES (UNIX)

       WNHOME              Base directory for WordNet.  Default is
                           /usr/local/WordNet-3.0.

       WNSEARCHDIR         Directory in which the WordNet database has been
                           installed.  Default is WNHOME/dict.
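
       The defaults above imply the following resolution order, sketched here
       in Python.  The function name search_dir is hypothetical, and treating
       an empty variable as unset is an assumption.

```python
import os

DEFAULT_WNHOME = "/usr/local/WordNet-3.0"  # default stated above


def search_dir(env=None):
    """Return the database directory implied by WNSEARCHDIR and WNHOME."""
    env = os.environ if env is None else env
    explicit = env.get("WNSEARCHDIR")
    if explicit:
        return explicit
    # Fall back to WNHOME/dict, using the default WNHOME if it is unset.
    home = env.get("WNHOME") or DEFAULT_WNHOME
    return home + "/dict"
```

       Passing a plain dictionary instead of the process environment makes the
       function easy to try out, for example search_dir({"WNHOME": "/opt/wn"}).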

REGISTRY (WINDOWS)

       HKEY_LOCAL_MACHINE\SOFTWARE\WordNet\3.0\WNHome
                           Base directory for WordNet.  Default is
                           C:\Program Files\WordNet\3.0.

FILES

       pos.exc             morphology exception lists

SEE ALSO

       wn(1), wnb(1), binsrch(3), morph(3), wndb(5), wninput(7).

WordNet 3.0                        Dec 2006                          MORPHY(7)