WNINPUT(5)                   WordNet™ File Formats                  WNINPUT(5)

NAME

       noun.suffix, verb.suffix, adj.suffix, adv.suffix - WordNet lexicogra‐
       pher files that are input to grind(1)

DESCRIPTION

       WordNet's source files are written by lexicographers.  They are the
       product of a detailed relational analysis of lexical semantics: a vari‐
       ety of lexical and semantic relations are used to represent the organi‐
       zation of lexical knowledge.  Two kinds of building blocks are distin‐
       guished in the source files: word forms and word meanings.  Word forms
       are represented in their familiar orthography; word meanings are repre‐
       sented by synonym sets (synsets) - lists of synonymous word forms that
       are interchangeable in some context.  Two kinds of relations are recog‐
       nized: lexical and semantic.  Lexical relations hold between word
       forms; semantic relations hold between word meanings.

       Lexicographer files correspond to the syntactic categories implemented
       in WordNet - noun, verb, adjective and adverb.  All of the synsets in a
       lexicographer file are in the same syntactic category.  Each synset
       consists of a list of synonymous words or collocations (e.g. "fountain
       pen", "take in"), and pointers that describe the relations between this
       synset and other synsets.  These relations include (but are not limited
       to) hypernymy/hyponymy, antonymy, entailment, and meronymy/holonymy.  A
       word or collocation may appear in more than one synset, and in more
       than one part of speech.  Each use of a word in a synset represents a
       sense of that word in the part of speech corresponding to the synset.

       Adjectives may be organized into clusters containing head synsets and
       satellite synsets.  Adverbs generally point to the adjectives from
       which they are derived.

       See wngloss(7) for a glossary of WordNet terminology and a discussion
       of the database's content and logical organization.

   Lexicographer File Names
       The names of the lexicographer files are of the form:

              pos.suffix

       where pos is either noun, verb, adj or adv.  suffix may be used to
       organize groups of synsets into different files, for example
       noun.animal and noun.plant.  See lexnames(5) for a list of
       lexicographer file names that are used in building WordNet.

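       As a small illustration of the pos.suffix naming scheme, the following
       Python sketch splits a file name into its two parts.  The function
       name split_lex_filename is hypothetical; it is not part of grind(1) or
       any other WordNet tool.

```python
# Syntactic categories allowed as the pos part of a lexicographer file name.
VALID_POS = {"noun", "verb", "adj", "adv"}


def split_lex_filename(name):
    """Split 'pos.suffix' into (pos, suffix); raise ValueError if malformed."""
    pos, sep, suffix = name.partition(".")
    if not sep or not suffix or pos not in VALID_POS:
        raise ValueError(f"not a lexicographer file name: {name!r}")
    return pos, suffix


print(split_lex_filename("noun.animal"))
```

       The sketch checks only the form of the name; whether a given suffix is
       actually used in building WordNet is determined by lexnames(5).
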
   Pointers
       Pointers are used to represent the relations between the words in one
       synset and another.  Semantic pointers represent relations between word
       meanings, and therefore pertain to all of the words in the source and
       target synsets.  Lexical pointers represent relations between word
       forms, and pertain only to specific words in the source and target
       synsets.  The following pointer types are usually used to indicate lex‐
       ical relations: Antonym, Pertainym, Participle, Also See, Derivation‐
       ally Related.  The remaining pointer types are generally used to repre‐
       sent semantic relations.

       A relation from a source to a target synset is formed by specifying a
       word from the target synset in the source synset, followed by the
       pointer_symbol indicating the pointer type.  The location of a pointer
       within a synset defines it as either lexical or semantic.  The Lexicog‐
       rapher File Format section describes the syntax for entering a semantic
       pointer, and Word Syntax describes the syntax for entering a lexical
       pointer.

       Although there are many pointer types, only certain types of relations
       are permitted between synsets of each syntactic category.

       The pointer_symbols for nouns are:
              !    Antonym
              @    Hypernym
              @i   Instance Hypernym
              ~    Hyponym
              ~i   Instance Hyponym
              #m   Member holonym
              #s   Substance holonym
              #p   Part holonym
              %m   Member meronym
              %s   Substance meronym
              %p   Part meronym
              =    Attribute
              +    Derivationally related form
              ;c   Domain of synset - TOPIC
              -c   Member of this domain - TOPIC
              ;r   Domain of synset - REGION
              -r   Member of this domain - REGION
              ;u   Domain of synset - USAGE
              -u   Member of this domain - USAGE

       The pointer_symbols for verbs are:
              !    Antonym
              @    Hypernym
              ~    Hyponym
              *    Entailment
              >    Cause
              ^    Also see
              $    Verb Group
              +    Derivationally related form
              ;c   Domain of synset - TOPIC
              ;r   Domain of synset - REGION
              ;u   Domain of synset - USAGE

       The pointer_symbols for adjectives are:
              !    Antonym
              &    Similar to
              <    Participle of verb
              \    Pertainym (pertains to noun)
              =    Attribute
              ^    Also see
              ;c   Domain of synset - TOPIC
              ;r   Domain of synset - REGION
              ;u   Domain of synset - USAGE

       The pointer_symbols for adverbs are:
              !    Antonym
              \    Derived from adjective
              ;c   Domain of synset - TOPIC
              ;r   Domain of synset - REGION
              ;u   Domain of synset - USAGE

       Many pointer types are reflexive, meaning that if a synset contains a
       pointer to another synset, the other synset should contain a corre‐
       sponding reflexive pointer.  grind(1) automatically inserts missing
       reflexive pointers for the following pointer types:

                  ┌───────────────────────┬────────────────────────┐
                  │Pointer                │ Reflect                │
                  ├───────────────────────┼────────────────────────┤
                  │Antonym                │ Antonym                │
                  │Hyponym                │ Hypernym               │
                  │Hypernym               │ Hyponym                │
                  │Instance Hyponym       │ Instance Hypernym      │
                  │Instance Hypernym      │ Instance Hyponym       │
                  │Holonym                │ Meronym                │
                  │Meronym                │ Holonym                │
                  │Similar to             │ Similar to             │
                  │Attribute              │ Attribute              │
                  │Verb Group             │ Verb Group             │
                  │Derivationally Related │ Derivationally Related │
                  │Domain of synset       │ Member of Domain       │
                  └───────────────────────┴────────────────────────┘

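       The effect of the table can be sketched in Python as follows.  This is
       only an illustration of the pointer/reflect pairing, not a description
       of how grind(1) is implemented; the representation of a relation as a
       (source, pointer type, target) tuple and the name add_reflexive are
       assumptions made for the example.

```python
# Pointer type -> reflexive pointer type, transcribed from the table above.
REFLECT = {
    "Antonym": "Antonym",
    "Hyponym": "Hypernym",
    "Hypernym": "Hyponym",
    "Instance Hyponym": "Instance Hypernym",
    "Instance Hypernym": "Instance Hyponym",
    "Holonym": "Meronym",
    "Meronym": "Holonym",
    "Similar to": "Similar to",
    "Attribute": "Attribute",
    "Verb Group": "Verb Group",
    "Derivationally Related": "Derivationally Related",
    "Domain of synset": "Member of Domain",
}


def add_reflexive(relations):
    """Return the relations plus any missing reflexive counterparts.

    Each relation is a hypothetical (source, pointer_type, target) tuple.
    Pointer types that are not in the table are left as they are.
    """
    result = set(relations)
    for source, pointer_type, target in relations:
        reflect = REFLECT.get(pointer_type)
        if reflect is not None:
            result.add((target, reflect, source))
    return result


completed = add_reflexive({("collie", "Hypernym", "dog")})
print(sorted(completed))
```

       Pointer types that are absent from the table, such as Cause, produce
       no reflexive pointer in this sketch.
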
   Verb Frames
       Each verb synset contains a list of generic sentence frames illustrat‐
       ing the types of simple sentences in which the verbs in the synset can
       be used.  For some verb senses, example sentences illustrating actual
       uses of the verb are provided.  (See Verb Example Sentences in
       wndb(5).)  Whenever there is no example sentence, the generic sentence
       frames specified by the lexicographer are used.  The generic sentence
       frames are entered in a synset as a comma-separated list of integer
       frame numbers.  The following list is the text of the generic frames,
       preceded by their frame numbers:

              1                      Something ----s
              2                      Somebody ----s
              3                      It is ----ing
              4                      Something is ----ing PP
              5                      Something ----s something Adjective/Noun
              6                      Something ----s Adjective/Noun
              7                      Somebody ----s Adjective
              8                      Somebody ----s something
              9                      Somebody ----s somebody
              10                     Something ----s somebody
              11                     Something ----s something
              12                     Something ----s to somebody
              13                     Somebody ----s on something
              14                     Somebody ----s somebody something
              15                     Somebody ----s something to somebody
              16                     Somebody ----s something from somebody
              17                     Somebody ----s somebody with something
              18                     Somebody ----s somebody of something
              19                     Somebody ----s something on somebody
              20                     Somebody ----s somebody PP
              21                     Somebody ----s something PP
              22                     Somebody ----s PP
              23                     Somebody's (body part) ----s
              24                     Somebody ----s somebody to INFINITIVE
              25                     Somebody ----s somebody INFINITIVE
              26                     Somebody ----s that CLAUSE
              27                     Somebody ----s to somebody
              28                     Somebody ----s to INFINITIVE
              29                     Somebody ----s whether INFINITIVE
              30                     Somebody ----s somebody into V-ing something
              31                     Somebody ----s something with something
              32                     Somebody ----s INFINITIVE
              33                     Somebody ----s VERB-ing
              34                     It ----s that CLAUSE
              35                     Something ----s INFINITIVE

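       To show how a frame number relates to a displayed sentence, the
       following Python sketch substitutes a verb for the ---- placeholder.
       Only a handful of the frames listed above are included, the names
       GENERIC_FRAMES and fill_frame are hypothetical, and the substitution
       is purely textual: it does no morphological processing, so it is
       adequate only for verbs whose inflected form is made by appending the
       frame's suffix.

```python
# A subset of the generic frames listed above, keyed by frame number.
GENERIC_FRAMES = {
    1: "Something ----s",
    2: "Somebody ----s",
    8: "Somebody ----s something",
    9: "Somebody ----s somebody",
    26: "Somebody ----s that CLAUSE",
}


def fill_frame(f_num, verb):
    """Substitute a verb for the ---- placeholder of a generic frame."""
    return GENERIC_FRAMES[f_num].replace("----", verb)


print(fill_frame(8, "interpret"))  # Somebody interprets something
```
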
   Lexicographer File Format
       Synsets are entered one per line, and each line is terminated with a
       newline character.  A line containing a synset may be as long as neces‐
       sary, but no newlines can be entered within a synset.  Within a synset,
       spaces or tabs may be used to separate entities.  Items enclosed in
       italicized square brackets are optional.

       The general synset syntax is:

              {   words  pointers   (  gloss  )  }

       Synsets of this form are valid for all syntactic categories except
       verb, and are referred to as basic synsets.  At least one word and a
       gloss are required to form a valid synset.  Pointers entered following
       all the words in a synset represent semantic relations between all the
       words in the source and target synsets.

       For verbs, the basic synset syntax is defined as follows:

              {   words  pointers  frames   (  gloss  )  }

       Adjectives may be organized into clusters containing one or more head
       synsets and optional satellite synsets.  Adjective clusters are of the
       form:

              [
              head synset
              [satellite synsets]
              [-]
              [additional head/satellite synsets]
              ]

       Each adjective cluster is enclosed in square brackets, and may have one
       or more parts.  Each part consists of a head synset and optional satel‐
       lite synsets that are conceptually similar to the head synset's mean‐
       ing.  Parts of a cluster are separated by one or more hyphens (-) on a
       line by themselves, with the terminating square bracket following the
       last synset.  Head and satellite synsets follow the syntax of basic
       synsets; however, a "Similar to" pointer must be specified in a head
       synset for each of its satellite synsets.  Most adjective clusters con‐
       tain two antonymous parts.  See wngloss(7) for a discussion of adjec‐
       tive clusters, and Special Adjective Syntax for more information on
       adjective cluster syntax.

       Synsets for relational adjectives (pertainyms) and participial adjec‐
       tives do not adhere to the cluster structure.  They use the basic
       synset syntax.

       Comments can be entered in a lexicographer file by enclosing the text
       of the comment in parentheses.  Note that comments cannot appear within
       a synset, as parentheses within a synset have an entirely different
       meaning (see Gloss Syntax).  However, entire synsets (or adjective
       clusters) can be "commented out" by enclosing them in parentheses.
       This is often used by the lexicographers to verify the syntax of files
       under development or to leave a note to oneself while working on
       entries.

   Word Syntax
       A synset must have at least one word, and the words of a synset must
       appear after the opening brace and before any other synset constructs.
       A word may be entered in either the simple word or word/pointer syntax.

       A simple word is of the form:

              word[ ( marker ) ][lex_id] ,

       word may be entered in any combination of upper and lower case unless
       it is in an adjective cluster.  A collocation is entered by joining the
       individual words with an underscore character (_).  Numbers (integer or
       real) may be entered, either by themselves or as part of a word string,
       by following the number with a double quote (").

       See Special Adjective Syntax for a description of adjective clusters
       and markers.

       word may be followed by an integer lex_id from 1 to 15.  The lex_id is
       used to distinguish different senses of the same word within a lexicog‐
       rapher file.  The lexicographer assigns lex_id values, usually in
       ascending order, although there is no requirement that the numbers be
       consecutive.  The default is 0, and does not have to be specified.  A
       lex_id must be used on pointers if the desired sense has a non-zero
       lex_id in its synset specification.

       Word/pointer syntax is of the form:

              [   word[ ( marker ) ][lex_id] ,   pointers   ]

       This syntax is used when one or more pointers correspond only to the
       specific word in the word/pointer set, rather than all the words in the
       synset, and represents a lexical relation.  Note that a word/pointer
       set appears within a synset, therefore the square brackets used to
       enclose it are treated differently from those used to define an adjec‐
       tive cluster.  Only one word can be specified in each word/pointer set,
       and any number of pointers may be included.  A synset can have any num‐
       ber of word/pointer sets.  Each is treated by grind(1) essentially as a
       word, so they all must appear before any synset pointers representing
       semantic relations.

       For verbs, the word/pointer syntax is extended in the following manner
       to allow the user to specify generic sentence frames that, like point‐
       ers, correspond only to a specific word, rather than all the words in
       the synset.  In this case, pointers are optional.

              [   word ,   [pointers]  frames   ]

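       The distinction between words, lexical pointers (inside a word/pointer
       set) and semantic pointers (outside of one) can be sketched with a
       small Python parser for basic, non-verb synsets.  This is a simplified
       illustration, not the parser used by grind(1).  It assumes that
       entities are separated by whitespace, that each square bracket stands
       alone as a token, and that a word or pointer contains no internal
       whitespace, as in the EXAMPLES section.  The name parse_basic_synset
       and the returned dictionary layout are hypothetical.

```python
import re


def parse_basic_synset(line):
    """Parse one non-verb synset line.

    Returns a dict with the words, the lexical pointers (from word/pointer
    sets), the semantic pointers, and the gloss (None if absent).
    """
    line = line.strip()
    if not (line.startswith("{") and line.endswith("}")):
        raise ValueError("a synset must be enclosed in braces")
    body = line[1:-1].strip()

    # The gloss, if any, is the parenthesized string that ends the synset.
    gloss = None
    match = re.search(r"(?:^|\s)\((.*)\)$", body)
    if match:
        gloss = match.group(1).strip()
        body = body[:match.start()]

    words, lexical, semantic = [], [], []
    in_set = False    # True while inside a [ word/pointer set ]
    set_word = None   # the word that owns the current set's pointers
    for token in body.split():
        if token == "[":
            in_set, set_word = True, None
        elif token == "]":
            in_set = False
        elif token.endswith(","):
            # "word," enters a word; any lex_id or marker is kept as written
            words.append(token[:-1])
            if in_set:
                set_word = token[:-1]
        elif "," in token:
            # "target,pointer_symbol" enters a pointer
            target, symbol = token.rsplit(",", 1)
            if in_set:
                lexical.append((set_word, target, symbol))
            else:
                semantic.append((target, symbol))
        else:
            raise ValueError(f"unrecognized token: {token!r}")
    return {"words": words, "lexical": lexical,
            "semantic": semantic, "gloss": gloss}


print(parse_basic_synset("{ canine, [ dog1, cat,! ] pooch, canid,@ }"))
```

       In the sample line, the antonym pointer to cat applies only to dog1,
       while the hypernym pointer to canid applies to every word in the
       synset.  Verb synsets, which add frame lists, are outside the scope of
       this sketch.
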
   Pointer Syntax
       Pointers are optional in synsets.  If a pointer is specified outside of
       a word/pointer set, the relation is applied to all of the words in the
       synset, including any words specified using the word/pointer syntax.
       This indicates a semantic relation between the meanings of the words in
       the synsets.  If specified within a word/pointer set, the relation cor‐
       responds only to the word in the set and represents a lexical relation.

       A pointer is of the form:

              [lex_filename: ]word[lex_id],pointer_symbol

       or:

              [lex_filename: ]word[lex_id]^word[lex_id],pointer_symbol

       For pointers, word indicates a word in another synset.  When the second
       form of a pointer is used, the first word indicates a word in a head
       synset, and the second is a word in a satellite of that cluster.  word
       may be followed by a lex_id that is used to match the pointer to the
       correct target synset.  The synset containing word may reside in
       another lexicographer file.  In this case, word is preceded by
       lex_filename as shown.

       See Pointers for a list of pointer_symbols and their meanings.

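       Both pointer forms can be taken apart with a regular expression, as in
       the following Python sketch.  It is an illustration under stated
       assumptions rather than the grammar used by grind(1): the pointer is
       assumed to contain no whitespace, trailing digits on a word are always
       read as a lex_id, and the names POINTER_RE and parse_pointer, as well
       as the keys of the returned dictionary, are hypothetical.

```python
import re

# [lex_filename:]word[lex_id][^word[lex_id]],pointer_symbol
POINTER_RE = re.compile(
    r"^(?:(?P<lex_filename>[^:,\s]+):)?"                       # optional file name
    r"(?P<word>[^\s,^:]+?)(?P<lex_id>\d{1,2})?"                # word and lex_id
    r"(?:\^(?P<sat_word>[^\s,^:]+?)(?P<sat_lex_id>\d{1,2})?)?"  # satellite word
    r",(?P<symbol>\S+)$"                                       # pointer_symbol
)


def parse_pointer(text):
    """Split a pointer into its parts; an unspecified lex_id defaults to 0."""
    match = POINTER_RE.match(text)
    if match is None:
        raise ValueError(f"not a pointer: {text!r}")
    parts = match.groupdict()
    parts["lex_id"] = int(parts["lex_id"] or 0)
    parts["sat_lex_id"] = int(parts["sat_lex_id"] or 0)
    return parts


print(parse_pointer("dog1,@"))
print(parse_pointer("adj.all:essential^basic,\\"))
```

       The first call uses the simple form; the second uses the head^satellite
       form with a lex_filename, as in the sample adverb synsets in the
       EXAMPLES section.
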
   Verb Frame List Syntax
       Frame numbers corresponding to generic sentence frames must be entered
       in each verb synset.  If a frame list is specified outside of a
       word/pointer set, the verb frames in the list apply to all of the words
       in the synset, including any words specified using the word/pointer
       syntax.  If specified within a word/pointer set, the verb frames in the
       list correspond only to the word in the set.

       A frame number list is entered as follows:

              frames:  f_num[,f_num...]

       Where f_num specifies a generic frame number.  See Verb Frames for a
       list of generic sentences and their corresponding frame numbers.

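       A frame number list on its own can be read with a few lines of Python.
       The sketch below assumes that whitespace is allowed around the commas,
       as in the sample verb synsets in the EXAMPLES section, and it checks
       each f_num against the range 1 to 35 given under Verb Frames.  The
       names FRAME_LIST_RE and parse_frame_list are hypothetical.

```python
import re

# "frames:" followed by one or more comma-separated integers.
FRAME_LIST_RE = re.compile(r"frames:\s*(\d+(?:\s*,\s*\d+)*)")


def parse_frame_list(text):
    """Return the frame numbers of a 'frames: f_num[,f_num...]' list."""
    match = FRAME_LIST_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a frame list: {text!r}")
    f_nums = [int(n) for n in match.group(1).split(",")]
    for f_num in f_nums:
        if not 1 <= f_num <= 35:
            raise ValueError(f"no generic frame numbered {f_num}")
    return f_nums


print(parse_frame_list("frames: 8, 10"))  # [8, 10]
```
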
   Gloss Syntax
       A gloss is included in all synsets.  The lexicographer may enter a text
       string of any length desired.  A gloss is simply a string enclosed in
       parentheses with no embedded carriage returns.  It provides a defini‐
       tion of what the synset represents and/or example sentences.

   Special Adjective Syntax
       The syntax for representing antonymous adjective synsets requires sev‐
       eral additional conditions.

       The first word of a head synset must be entered in upper case, and can
       be thought of as the head word of the head synset.  The word part of a
       pointer from one head synset to another head synset within the same
       cluster (usually an antonym) must also be entered in upper case.  Usu‐
       ally antonymous adjectives are entered using the word/pointer syntax
       described in Word Syntax to indicate a lexical relation.  There is no
       restriction on the number of parts that a cluster may have, and some
       clusters have three parts, representing antonymous triplets, such as
       solid, liquid, and gas.

       A cross-cluster pointer may be specified, allowing a head or satellite
       synset to point to a head synset in a different cluster.  A cross-clus‐
       ter pointer is indicated by entering the word part of the pointer in
       upper case.

       An adjective may be annotated with a syntactic marker indicating a lim‐
       itation on the syntactic position the adjective may have in relation to
       the noun that it modifies.  If so marked, the marker appears between
       the word and its following comma.  If a lex_id is specified, the marker
       immediately follows it.  The syntactic markers are:
              (p)                    predicate position
              (a)                    prenominal (attributive) position
              (ip)                   immediately postnominal position

EXAMPLES

       (Note that these are hypothetical examples not found in the WordNet
       lexicographer files.)

       Sample noun synsets:
              { canine, [ dog1, cat,! ] pooch, canid,@ }
              { collie, dog1,@ (large multi-colored dog with pointy nose) }
              { hound, hunting_dog, pack,#m dog1,@ }
              { dog, }

       Sample verb synsets:
              { [ confuse, clarify,! frames: 1 ] blur, obscure, frames: 8, 10 }
              { [ clarify, confuse,! ] make_clear, interpret,@ frames: 8 }
              { interpret, construe, understand,@ frames: 8 }

       Sample adjective clusters:
              [
              { [ HOT, COLD,! ] lukewarm(a), TEPID,^ (hot to the touch) }
              { warm, }
              -
              { [ COLD, HOT,! ] frigid, (cold to the touch) }
              { freezing, }
              ]

       Sample adverb synsets:
              { [ basically, adj.all:essential^basic,\ ] [ essentially, adj.all:basic^fundamental,\ ] ( by one's very nature )}
              { pointedly, adj.all:pungent^pointed,\ }
              { [ badly, adj.all:bad,\ well,! ] ill, ("He was badly prepared") }

SEE ALSO

       grind(1), wnintro(5), lexnames(5), wndb(5), uniqbeg(7), wngloss(7).

       Fellbaum, C. (1998), ed.  "WordNet: An Electronic Lexical Database".
       MIT Press, Cambridge, MA.

WordNet 3.0                        Dec 2006                         WNINPUT(5)