GRIND(1)                    WordNet™ User Commands                    GRIND(1)

NAME

       grind - process WordNet lexicographer files

SYNOPSIS

       grind [ -v ] [ -s ] [ -Llogfile ] [ -a ] [ -d ] [ -i ] [ -o ] [ -n ]
       filename [ filename... ]

DESCRIPTION

       grind() processes WordNet lexicographer files, producing database files
       suitable for use with the WordNet search and interface code and other
       applications.  The syntactic and structural integrity of the input
       files is verified.  Warnings and errors are reported via stderr and a
       run-time log is produced on stdout.  A database is generated only if
       there are no errors.

   Input Files
       Input files correspond to the syntactic categories implemented in
       WordNet: noun, verb, adjective and adverb.  Each input lexicographer
       file consists of a list of synonym sets (synsets) for one part of
       speech.  Although the basic synset syntax is the same for all parts of
       speech, some parts of the syntax apply only to a particular part of
       speech.  See wninput(5) for a description of the input file format.

       Each filename specified is of the form:

              pathname/pos.suffix

       where pathname is optional and pos is either noun, verb, adj or adv.
       suffix may be used to separate groups of synsets into different files,
       for example noun.animal and noun.plant.  One or more input files, in
       any combination of syntactic categories, may be specified.  See
       lexnames(5) for a list of the lexicographer files used to build the
       complete WordNet database.
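
       The filename form above can be checked before a run.  The following
       Python sketch is illustrative only and is not part of grind: the
       helper name split_lex_filename is hypothetical, and it assumes that a
       non-empty suffix is required after the dot.

```python
import os

# Part-of-speech prefixes named in the text above.
VALID_POS = ("noun", "verb", "adj", "adv")


def split_lex_filename(filename):
    """Return (pos, suffix) for a name shaped like pathname/pos.suffix.

    Hypothetical helper: returns None when the basename does not start
    with a recognised part of speech followed by a dot and a suffix.
    """
    base = os.path.basename(filename)      # drop the optional pathname
    pos, dot, suffix = base.partition(".")  # split at the first dot only
    if pos in VALID_POS and dot and suffix:
        return pos, suffix
    return None


print(split_lex_filename("dict/dbfiles/noun.animal"))
print(split_lex_filename("verb.motion"))
print(split_lex_filename("notes.txt"))
```

       A name that yields None here is the kind of argument that would be
       expected to trigger the "No input files processed." diagnostic if no
       other valid filename is given.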

   Output Files
       grind() produces the following output files:

               ┌────────────┬────────────────────────────────────────┐
               │Filename    │ Description                            │
               ├────────────┼────────────────────────────────────────┤
               │index.pos   │ Index file for each syntactic category │
               │data.pos    │ Data file for each syntactic category  │
               │index.sense │ Sense index                            │
               └────────────┴────────────────────────────────────────┘
       See wndb(5) for a description of the database file formats.

       Each time grind() is run, any existing database files are overwritten
       with the database files generated from the specified input files.  If
       no input files from a syntactic category are specified, the
       corresponding database files are not overwritten.
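
       The overwrite rule can be made concrete with a small Python sketch.
       It is an illustration, not grind's own logic: the function name
       files_rewritten is hypothetical, and tying index.sense to the -i
       option is an assumption drawn from that option's description.

```python
import os

VALID_POS = ("noun", "verb", "adj", "adv")


def files_rewritten(input_filenames, sense_index=False):
    """List the database files a run would be expected to overwrite.

    Hypothetical helper. sense_index stands in for the -i option
    (assumption: index.sense is only written when -i is given).
    """
    seen = []
    for name in input_filenames:
        pos, dot, _ = os.path.basename(name).partition(".")
        if dot and pos in VALID_POS and pos not in seen:
            seen.append(pos)

    out = []
    for pos in seen:
        # Only categories present in the input get new index/data files.
        out.append("index." + pos)
        out.append("data." + pos)
    if sense_index and seen:
        out.append("index.sense")
    return out


print(files_rewritten(["noun.animal", "noun.plant", "verb.motion"]))
```

       With only noun and verb files supplied, the adjective and adverb
       database files do not appear in the result, which matches the rule
       that categories without input are left untouched.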

   Sense Numbers
       Senses are generally ordered from most to least frequently used, with
       the most common sense numbered 1.  Frequency of use is determined by
       the number of times a sense is tagged in the various semantic
       concordance texts.  Senses that are not semantically tagged follow the
       ordered senses in an arbitrary order.  Note that this ordering is only
       an estimate based on usage in a small corpus.

       The tagsense_cnt field for each entry in the index.pos files indicates
       how many of the senses in the list have been tagged.

       The cntlist file provided with the database lists the number of times
       each sense is tagged in the semantic concordances.  grind() uses the
       data from cntlist to order the senses of each word.  When the index.pos
       files are generated, the synset_offsets are output in sense number
       order, with sense 1 first in the list.  Senses with the same number of
       semantic tags are assigned unique but consecutive sense numbers.  The
       WordNet OVERVIEW search displays all senses of the specified word, in
       all syntactic categories, and indicates which of the senses are
       represented in the semantically tagged texts.
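
       The ordering described above can be sketched in Python.  This is a
       minimal illustration under stated assumptions, not grind's code: the
       function order_senses and the sense identifiers are hypothetical, tag
       counts are passed in as a plain dictionary rather than read from
       cntlist (see cntlist(5) for the real format), and ties and untagged
       senses keep their input order, where grind's order is arbitrary.

```python
def order_senses(senses, tag_counts):
    """Number the senses of one word by descending tag count.

    senses:     list of sense identifiers (hypothetical labels)
    tag_counts: dict mapping a sense identifier to its tag count
    Returns (numbered, tagsense_cnt), where numbered is a list of
    (sense_number, sense) pairs starting at sense number 1.
    """
    tagged = [s for s in senses if tag_counts.get(s, 0) > 0]
    untagged = [s for s in senses if tag_counts.get(s, 0) == 0]

    # list.sort is stable, so senses with equal counts keep input order
    # and still receive unique, consecutive sense numbers.
    tagged.sort(key=lambda s: -tag_counts[s])

    ordered = tagged + untagged  # untagged senses follow the tagged ones
    numbered = list(enumerate(ordered, start=1))
    return numbered, len(tagged)


numbered, tagsense_cnt = order_senses(
    ["a", "b", "c", "d"], {"a": 2, "b": 5, "c": 5}
)
print(numbered)
print(tagsense_cnt)
```

       Here "b" and "c" share a count of 5 and receive sense numbers 1 and
       2, "a" follows as sense 3, and the untagged "d" comes last; the
       second return value plays the role of tagsense_cnt.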

OPTIONS

       -v             Verify integrity of input without generating database.

       -s             Suppress generation of warning messages.  Usually grind
                      is run with this option until all syntactic and
                      structural errors are corrected, since the warning
                      messages may make it difficult to spot error messages.

       -Llogfile      Write all messages to logfile instead of stderr.

       -a             Generate statistical report on input files processed.

       -d             Generate distribution of senses by string length report
                      on input files processed.

       -i             Generate sense index file.

       -o             Order senses using cntlist.

       -n             Generate nominalization (derivational morphology) links
                      in database.

       filename       Input file of the form described in Input Files.

FILES

       pos.*               lexicographer files to use to build database

       cntlist             file of combined semantic concordance cntlist
                           files.  Used to assign sense numbers in WordNet
                           database

SEE ALSO

       cntlist(5), lexnames(5), senseidx(5), wndb(5), wninput(5), uniqbeg(7),
       wngloss(7).

DIAGNOSTICS

       Exit status is normally 0.  Exit status is -1 if a non-specific error
       occurs.  If syntactic or structural errors exist, exit status is the
       number of errors detected.

       usage: grind [-v] [-s] [-Llogfile] [-a ] [-d] [-i] [-o] [-n] filename
       [filename...]
              Invalid options were specified on the command line.

       No input files processed.
              None of the filenames specified were of the appropriate form.

       n syntactic errors found.
              Syntax errors were found while parsing the input files.

       n structural errors found.
              Pointer errors were found that could not be automatically
              corrected.
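
       A calling script may want to turn the exit status into a message.
       The Python sketch below is an assumption-laden illustration: the
       function describe_exit is hypothetical, and it expects the status as
       a POSIX shell reports it in $?, where only the low 8 bits survive, so
       the documented -1 is seen as 255.

```python
def describe_exit(status):
    """Describe an 8-bit exit status (0-255) using the rules above.

    Hypothetical helper. On POSIX systems an exit value of -1 reaches
    the caller as 255, so 255 is treated as the non-specific error.
    """
    if not 0 <= status <= 255:
        raise ValueError("expected an exit status between 0 and 255")
    if status == 0:
        return "no errors"
    if status == 255:
        return "non-specific error"
    return "%d syntactic or structural error(s) detected" % status


print(describe_exit(0))
print(describe_exit(3))
print(describe_exit(255))
```

       Because the status is truncated to 8 bits, a large error count cannot
       be recovered from it; the "n syntactic errors found." and "n
       structural errors found." messages remain the reliable source for the
       counts.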

BUGS

       Please report bugs to wordnet@princeton.edu.



WordNet 3.0                        Dec 2006                           GRIND(1)