DICTD(8)                                                              DICTD(8)

NAME

       dictd - a dictionary database server

SYNOPSIS

       dictd [options]

DESCRIPTION

       dictd is a server for the Dictionary Server Protocol (DICT), a TCP
       transaction based query/response protocol that allows a client to
       access dictionary definitions from a set of natural language dictionary
       databases.

       For security reasons, dictd drops root permissions after startup. If
       user dictd exists on the system, the daemon will run as that user,
       group dictd; otherwise it will run as user nobody, group nobody or
       nogroup (depending on the operating system distribution).

       Since startup time is significant, the server is designed to run
       continuously, and should not be run from inetd(8). (However, with a
       fast processor, it is feasible to do so.)

       Databases are distributed separately from the server.

       By default, dictd assumes that the index files are sorted
       alphabetically, and only alphanumeric characters from the 7-bit ASCII
       character set are used for search. This default may be overridden by a
       header in the data file. The only such headers implemented at this
       time are "00-database-allchars", which tells dictd that
       non-alphanumeric characters may also be used for search;
       "00-database-utf8", which indicates that the database uses UTF-8
       encoding; and "00-database-8bit-new", which indicates that the
       database is encoded and sorted according to a locale that uses an
       8-bit encoding.

       A header "00-database-plugin" may also be present and is used for
       integrating plugins into dictd. See "dictfmt_plugin --help" and
       "dictdplugin.h" for more information.

       A header "00-database-virtual" identifies "virtual dictionaries", which
       are lists of real dictionaries to be searched by dictd.

BACKGROUND

       For many years, the Internet community has relied on the "webster"
       protocol for access to natural language definitions. The webster
       protocol supports access to a single dictionary and (optionally) to a
       single thesaurus. In recent years, the number of publicly available
       webster servers on the Internet has dramatically decreased.

       Fortunately, several freely-distributable dictionaries and lexicons
       have recently become available on the Internet. However, these
       freely-distributable databases are not accessible via a uniform
       interface, and are not accessible from a single site. They are often
       small and incomplete individually, but would collectively provide an
       interesting and useful database of English words. Examples include the
       Jargon file, the WordNet database, MICRA's version of the 1913
       Webster's Revised Unabridged Dictionary, and the Free Online Dictionary
       of Computing. (See the DICT protocol specification (RFC) for
       references.) Translating and non-English dictionaries are also becoming
       available (for example, the FOLDOC dictionary is being translated into
       Spanish).

       The webster protocol is not suitable for providing access to a large
       number of separate dictionary databases, and extensions to the current
       webster protocol were not felt to be a clean solution to the dictionary
       database problem.

       The DICT protocol is designed to provide access to multiple databases.
       Word definitions can be requested, the word index can be searched
       (using an easily extended set of algorithms), information about the
       server can be provided (e.g., which index search strategies are
       supported, or which databases are available), and information about a
       database can be provided (e.g., copyright, citation, or distribution
       information). Further, the DICT protocol has hooks that can be used to
       restrict access to some or all of the databases.
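
       As an illustration of the query/response exchange described above,
       the following Python sketch sends a single DEFINE command to a DICT
       server. It is not part of dictd. The function names, the localhost
       default, and the practice of sending DEFINE and QUIT together are
       assumptions made for this example; the 220 greeting code and port
       2628 follow the DICT protocol RFC as understood here.

```python
import socket


def parse_status(line):
    """Split a DICT status line such as '250 ok' into (code, text)."""
    code, _, text = line.strip().partition(" ")
    return int(code), text


def define(word, database="*", host="localhost", port=2628):
    """Send one DEFINE command and return the raw response lines.

    Hypothetical helper: host, port and database defaults are assumptions.
    The word is sent as-is, so it must not contain double quotes.
    """
    with socket.create_connection((host, port), timeout=10) as sock:
        with sock.makefile("r", encoding="utf-8") as reader:
            # The server speaks first; anything other than 220 (for
            # example 420, server temporarily unavailable) is a refusal.
            code, text = parse_status(reader.readline())
            if code != 220:
                raise ConnectionError(f"server refused: {code} {text}")
            request = f'DEFINE {database} "{word}"\r\nQUIT\r\n'
            sock.sendall(request.encode("utf-8"))
            # After QUIT the server closes the connection, ending the loop.
            return [line.rstrip("\r\n") for line in reader]
```

       Calling define("penguin") against a running server returns the
       status lines and definition text in the order the server sent them.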

       dictd(8) is a server that implements the DICT protocol. Bret Martin
       implemented another server, and several people (including Bret and
       myself) have implemented clients in a variety of languages.

OPTIONS

       -V or --version
              Display version information.

       --license
              Display copyright and license information.

       -h or --help
              Display help information.

       -v or --verbose or -dverbose
              Be verbose.

       -c file or --config file
              Specify configuration file. The default is /etc/dictd.conf, but
              may be changed in the defs.h file at compile time
              (DICTD_CONFIG_FILE).

       -p port or --port port
              Specifies the port (e.g., 2628). The default is 2628, as
              specified in the DICT Protocol RFC, but may be changed in the
              defs.h file at compile time (DICT_DEFAULT_SERVICE).

       -i or --inetd
              Communicate on standard input/output, suitable for use from
              inetd. Although this daemon was not intended to run from inetd
              because of its rather large startup time, with a fast processor
              it is feasible to do so. This option also implies --fast-start.

       --pp prog
              Sets a preprocessor for the configuration file, such as m4 or
              cpp. See the example_complex.conf file from the distribution. By
              default the configuration file is parsed without a preprocessor.

       --depth length
              Specify the queue length for listen(2). Specifies the number of
              pending socket connections which are queued by the operating
              system. Some operating systems may silently limit this value to
              5 (older BSD systems) or 128 (Linux). The default is 10 but may
              be changed in the defs.h file at compile time
              (DICT_QUEUE_DEPTH).

       --delay seconds
              Specifies the number of seconds a client may be idle before the
              server will close the connection. Idle time is defined to be
              the time the server is waiting for input and does not include
              the time the server spends searching the database. Connections
              are closed without warning since no provision for premature
              connection termination is specified in the DICT protocol RFC.
              The default is 600 seconds (10 minutes), but may be changed in
              the defs.h file at compile time (DICT_DEFAULT_DELAY).

       --facility facility
              Specifies the syslog facility to use. The use of this option
              implies the -s option to turn on logging via syslog. When the
              operating system libraries support SYSLOG_NAMES, the names used
              for this option should be those listed in syslog.conf(5).
              Otherwise, the following names are used (assuming the particular
              facility is defined in the header files): auth, authpriv, cron,
              daemon, ftp, kern, lpr, mail, news, syslog, user, uucp, local0,
              local1, local2, local3, local4, local5, local6, and local7.

       -f or --force
              Force the daemon to start even if an instance of the daemon is
              already running. (This is of little value unless a non-default
              port is specified with -p, since, if one instance is bound to a
              port, the second one fails when it cannot bind to the port.)

       --limit children
              Specifies the number of daemons that may be running
              simultaneously. Each daemon services a single connection. If the
              limit is exceeded, a (serialized) connection will be made by the
              server process, and a response code 420 (server temporarily
              unavailable) will be sent to the client. This parameter should
              be adjusted to prevent the server machine from being overloaded
              by dict clients, but should not be set so low that many clients
              are denied useful connections. The default is 100, but may be
              changed in the defs.h file at compile time (DICT_DAEMON_LIMIT).

       --listen-to address
              Binds the socket to the specified address. To allow connections
              to the dict server from localhost only, use
              --listen-to 127.0.0.1.

       --locale locale
              Specifies the locale used for searching. If no locale is
              specified, the "C" locale is used. The locale used for the
              server should be the same as that used for dictfmt when the
              database was built (specifically, the locale under which the
              index was sorted). The locale should be specified for both 8-bit
              and UTF-8 formats. If the locale contains the substring utf8 or
              utf-8, UTF-8 format is expected. Note that if your database is
              not in 7-bit ASCII or UTF-8 format, then the dictd server will
              not be compliant with RFC 2229.

              NOTE: If UTF-8 or 8-bit dictionaries are included in the
              configuration file, and the appropriate --locale has not been
              specified, dictd will fail to start. This implies that dictd
              will not run with both UTF-8 and 8-bit dictionaries in the
              configuration file.

       -s     Log using the syslog(3) facility.

       -L file or --logfile file
              Specify the file for logging. The filename specified is
              recomputed on each use using the strftime(3) call. For example,
              a filename ending in ".%Y%m%d" will write to log files ending in
              the year, month, and day on which the log entry was written.
              NOTE: If dictd does not have write permission for this file, it
              will silently fail.
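
              To preview how a --logfile template expands, the following
              Python sketch applies the same strftime conversions to a fixed
              date. The path /var/log/dictd.log and the helper name are
              hypothetical; dictd itself calls the C strftime(3) function.

```python
import time


def expand_logfile(template, when):
    """Expand strftime conversions in a log file name template."""
    return time.strftime(template, when)


# A fixed timestamp, so that the result is reproducible.
stamp = time.strptime("2024-03-05 12:00:00", "%Y-%m-%d %H:%M:%S")
print(expand_logfile("/var/log/dictd.log.%Y%m%d", stamp))
# prints: /var/log/dictd.log.20240305
```

              Because the name is recomputed on each use, a template like
              this one yields one log file per day.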

       -m minutes or --mark minutes
              How often a timestamp should be logged. (This is effective only
              if logging has been enabled with the -s or -L option, or with a
              debugging option.)

       --default-strategy strategy
              Set the server's default search strategy for the MATCH search
              type. The default is 'lev'. It is also possible to set the
              default strategy per database. See the default_strategy keyword
              in the Database Specification section.

       --without-strategy strat1,strat2,...
              Disable the specified strategies. By default all search
              strategies are enabled.

       --add-strategy strat:descr
              Adds strategy 'strat' with the description 'descr'. A new search
              strategy may be implemented with the help of plugins.

       --test word or -t word
              Self test: look up word.

       --test-file file or --ftest file
              Self test: look up all words in file.

       --test-strategy strategy
              Self test: set the search strategy for --test and --ftest. The
              default is 'exact'.

       --test-db database
              Self test: set the dictionary to be searched. The default is
              '*'.

       --test-match
              Self test: set the search type to MATCH. The default is DEFINE.

       --fast-start
              By default, dictd creates an additional in-memory index to make
              searches faster. This option disables this behaviour and makes
              startup faster.

       --without-mmap
              Do not use the mmap() function; read entire files into memory
              instead. Use this option only if you know exactly what you are
              doing.

       -l option or --log option
              Specify a logging option. This is effective only if logging has
              been enabled with the -s or -L option, or logging to the console
              has been activated with a debugging option (e.g., --debug
              nodetach). Only one option may be set with each invocation of
              this option; however, multiple invocations of this option may be
              made in one dictd command line. For instance:
              dictd -s --log stats --log found --log notfound
              is a valid command line, and sets three logging options.

              Some of the more verbose logging options are used primarily for
              debugging the server code, and are not practical for normal use.

              server Log server diagnostics. This is extremely verbose.

              connect
                     Log all connections.

              stats  Log all children terminations.

              command
                     Log all commands. This is extremely verbose.

              client Log results of CLIENT command.

              found  Log all words found in the databases.

              notfound
                     Log all words not found in the databases.

              timestamp
                     When logging to a file, use a full timestamp like that
                     which syslog would produce. Otherwise, no timestamp is
                     made, making the files shorter.

              host   Log name of foreign host.

              auth   Log authentication failures.

              min    Set a minimal number of options. If logging is activated
                     (to a file, or via syslog), and no options are set, then
                     the minimal set of options will be used. If options are
                     set, then only those options specified will be used.

              all    Set all of the options.

              none   Clear all of the options.

              To facilitate location of interesting information in the log
              file, entries are marked with initial letters indicating the
              class of the line being logged:

              I      Information about the server, connections, or termination
                     statistics. These lines are generally not designed to be
                     parsed automatically.

              E      Error messages.

              C      CLIENT command information.

              D      Definitions found in the databases searched.

              M      Matches found in the database searched.

              N      Matches which were not found in the databases searched.

              T      Trace of exact line sent by client.

              A      Authentication information.

              To preserve anonymity of the client, do not use the connect or
              host options. Clients may or may not send host information
              using the CLIENT command, but this should be an option that is
              selectable on the client side.
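
              Since each --log invocation sets exactly one option, a wrapper
              script has to repeat the flag once per option. The following
              Python sketch builds such a command line; the helper name is
              hypothetical, and only the -s and --log flags documented above
              are used.

```python
import shlex


def dictd_command(log_options, use_syslog=True):
    """Build a dictd argument list with one --log flag per option."""
    argv = ["dictd"]
    if use_syslog:
        argv.append("-s")  # logging must be enabled for --log to matter
    for option in log_options:
        argv += ["--log", option]
    return argv


print(shlex.join(dictd_command(["stats", "found", "notfound"])))
# prints: dictd -s --log stats --log found --log notfound
```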

       -d option
              Activate a debugging option. There are several, all of which
              are only useful to developers. They are documented here for
              completeness. A list can be obtained interactively by using -d
              with an illegal option.

              verbose
                     The same as -v or --verbose. Adds verbosity to other
                     options.

              scan   Debug the scanner for the configuration file.

              parse  Debug the parser for the configuration file.

              search Debug the character folding and binary search routines.

              init   Report database initialization.

              port   Log client-side port number to the log file.

              lev    Debug Levenshtein search algorithm.

              auth   Debug the authorization routines.

              nodetach
                     Do not detach as a background process. Implies that a
                     copy of the log file will appear on the standard output.

              nofork Do not fork daemons to service requests. Be a
                     single-threaded server. This option implies nodetach, and
                     is most useful for using a debugger to find the point at
                     which daemon processes are dumping core.

              alt    Debugs altcompare in index.c.

CONFIGURATION FILE

       Introduction
              The configuration file defaults to /etc/dictd.conf but can be
              specified on the command line with the -c option (see above).

              The configuration file is read into memory at startup, and is
              not referenced again by dictd unless a signal 1 (SIGHUP) is
              received, which will cause dictd to reread the configuration
              file.

              The file is divided into sections. The Site Section should come
              first, followed by the Access Section, the Database Section, and
              the User Section. The Database Section is required; the others
              are optional, but they must be in the order listed here.
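
              The required section order can be made concrete with a small
              generator. The following Python sketch emits a configuration
              with the sections in the order listed above and refuses to
              produce one without a database. All file names, the user name,
              and the secret are hypothetical, and the layout inside the
              braces is one plausible formatting rather than a required one.

```python
def render_config(databases, site=None, access=(), users=()):
    """Return dictd.conf text with sections in the documented order.

    databases: iterable of (name, data_file, index_file) tuples
    access:    iterable of access lines such as "allow 127.0.0.1"
    users:     iterable of (username, shared_secret) tuples
    """
    databases = list(databases)
    if not databases:
        raise ValueError("the Database Section is required")
    lines = []
    if site:                                  # 1. Site Section
        lines.append(f'site "{site}"')
    if access:                                # 2. Access Section
        lines.append("access {")
        lines += [f"    {rule}" for rule in access]
        lines.append("}")
    for name, data, index in databases:       # 3. Database Section
        lines.append(f"database {name} {{")
        lines.append(f'    data "{data}"')
        lines.append(f'    index "{index}"')
        lines.append("}")
    for username, secret in users:            # 4. User Section
        lines.append(f'user {username} "{secret}"')
    return "\n".join(lines) + "\n"


print(render_config(
    [("wn", "wn.dict.dz", "wn.index")],
    site="/etc/dictd/site.info",
    access=["allow 127.0.0.1", "deny *"],
    users=[("alice", "s3cret")],
))
```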

       Syntax The following keywords are valid in a configuration file:
              access, allow, deny, group, database, data, index, filter,
              prefilter, postfilter, name, include, user, authonly, site.
              Keywords are case sensitive. String arguments that contain
              spaces should be surrounded by double quotes. Without quoting,
              strings may contain alphanumeric characters and _, -, ., and *,
              but not spaces. Strings can be continued between lines. \", \\,
              \n, and \<NL> are treated as a double quote, a backslash, a
              newline, and no character, respectively. Comments start with #
              and extend to the end of the line.

       Site Section

              site string
                     Used to specify the filename for the site information
                     file, a flat text file which will be displayed in
                     response to the SHOW SERVER command. This section, if
                     present, must be first.

       Access Section

              access { access specification }
                     This section, the second if the Site Section is present,
                     contains access restrictions for the server and all of
                     the databases collectively. Per-database control is
                     specified in the Database Section.

       Database Section

              database string { database specification }
                     The string specifies the name of the database (e.g., wn
                     or web1913). (This is an arbitrary name selected by the
                     administrator, and is not necessarily related to the file
                     name or any name listed in the data file. A short, easy
                     to type name is often selected for easy use with dict
                     -d.)

                     NOTE: If the files specified in the database
                     specification do not exist on the system, dictd may
                     silently fail.

              database_virtual string { virtual database specification }
                     This section specifies the virtual database. The string
                     specifies the name of the database (e.g., en-ru or fren).

              database_plugin string { plugin specification }
                     This section specifies the plugin. The string specifies
                     the name of the database.

              database_exit
                     Excludes the following databases from the '*' database.
                     By default '*' means all databases available. See the
                     'example_virtual.conf' file for an example configuration.

                     NOTE: If you use 'virtual' dictionaries, you should use
                     this directive, otherwise you will search the same
                     dictionary twice.

       User Section

              user string string
                     The first string specifies the username, and the second
                     string specifies the shared secret for this username.
                     When the AUTH command is used, the client will provide
                     the username and a hashed version of the shared secret.
                     If the shared secret matches, the user is said to have
                     authenticated, and will have access to databases whose
                     access specifications allow that user (by name, or by
                     wildcard). If present, this section must appear last in
                     the configuration file. There may be many user entries.
                     The shared secret should be kept secret, as anyone who
                     has access to it can access the shared databases
                     (assuming access is not denied by domain name).
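
                     The "hashed version of the shared secret" is not
                     spelled out here. The following Python sketch assumes
                     the scheme from the DICT protocol RFC as understood
                     by this example: an MD5 digest, in lowercase
                     hexadecimal, of the message id from the server
                     banner (angle brackets included) followed by the
                     shared secret. The helper names and sample values
                     are hypothetical.

```python
import hashlib


def auth_string(msg_id, secret):
    """Return the hex MD5 digest of the banner msg-id plus the secret.

    Assumption: msg_id is passed exactly as it appears in the banner,
    including the surrounding angle brackets.
    """
    return hashlib.md5((msg_id + secret).encode("utf-8")).hexdigest()


def auth_command(username, msg_id, secret):
    """Build the AUTH command line a client would send."""
    return f"AUTH {username} {auth_string(msg_id, secret)}"


print(auth_command("alice", "<1234.5678@dict.example.org>", "s3cret"))
```

                     The secret itself never crosses the wire, but anyone
                     who knows it can compute the same digest, which is
                     why it has to be protected on the server.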

       Access Specification
              Access specifications may occur in the Access Section or in the
              Database Section. The access specification is described here.

              For allow, deny, and authonly, a star (*) may be used as a
              wildcard that matches any number of characters. A question mark
              (?) may be used as a wildcard that matches a single character.
              For example, 10.0.0.* and *.edu are valid strings.
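
              The wildcard rules can be tried out with the following Python
              sketch, which translates a pattern into a regular expression
              using only the two wildcards described above. It is an
              approximation for experimenting with patterns, not dictd's own
              matcher; in particular, matching case-sensitively is an
              assumption.

```python
import re


def wildcard_to_regex(pattern):
    """Translate * and ? wildcards into an equivalent compiled regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")   # any number of characters
        elif ch == "?":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def matches(pattern, host):
    """Return True if the whole host string matches the pattern."""
    return wildcard_to_regex(pattern).fullmatch(host) is not None


print(matches("10.0.0.*", "10.0.0.17"))    # True
print(matches("*.edu", "cs.example.edu"))  # True
print(matches("10.0.0.?", "10.0.0.17"))    # False: ? is one character
```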

              Further, a range of IP addresses or an IP address followed by a
              netmask may be specified. For example, 10.0.0.0:10.0.0.255,
              10.0.0.0/24, and 10.0.0.* all specify the same range of IP
              numbers. Notations cannot be combined on the same line. If the
              notation does not make sense, access will be denied by default.
              Use the --debug auth option to debug related problems.
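
              That the three notations cover the same addresses can be
              checked with the following Python sketch, which reduces each
              one to a (first, last) pair of integers. The helper names are
              hypothetical, and the wildcard helper handles only a single
              trailing ".*".

```python
import ipaddress


def range_from_colon(spec):
    """'10.0.0.0:10.0.0.255' -> (first, last) as integers."""
    first, last = spec.split(":")
    return int(ipaddress.IPv4Address(first)), int(ipaddress.IPv4Address(last))


def range_from_netmask(spec):
    """'10.0.0.0/24' -> (first, last) as integers."""
    net = ipaddress.IPv4Network(spec, strict=True)
    return int(net.network_address), int(net.broadcast_address)


def range_from_trailing_star(spec):
    """'10.0.0.*' -> (first, last); only a trailing '.*' is supported."""
    if not spec.endswith(".*"):
        raise ValueError("expected a pattern ending in '.*'")
    octets = spec[:-2].split(".")
    missing = 4 - len(octets)
    first = ".".join(octets + ["0"] * missing)
    last = ".".join(octets + ["255"] * missing)
    return int(ipaddress.IPv4Address(first)), int(ipaddress.IPv4Address(last))


ranges = {
    range_from_colon("10.0.0.0:10.0.0.255"),
    range_from_netmask("10.0.0.0/24"),
    range_from_trailing_star("10.0.0.*"),
}
print(len(ranges) == 1)  # True: all three describe the same range
```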

              Note that these specifications take only one string per
              specification line. However, you can have multiple lines of each
              type.

              The syntax is as follows:

              allow string
                     The string specifies a domain name or IP address which is
                     allowed access to the server (in the Access Section) or
                     to a database (in the Database Section). Note that more
                     than one string is not permitted on a single "allow"
                     line, but more than one "allow" line is permitted in the
                     configuration file.

              deny string
                     The string specifies a domain name or IP address which is
                     denied access to the server (in the Access Section) or to
                     a database (in the Database Section). Note that if
                     reverse DNS is not working, then only the IP number will
                     be checked. Therefore, it is essential to deny networks
                     based on IP number, since a denial based on domain name
                     may not always be checked.

              authonly string
                     This form is only useful in the Access Section. The
                     string specifies a domain name or IP address which is
                     allowed access to the server but not to any of the
                     databases. All commands are valid except DEFINE, MATCH,
                     and SHOW DB. More specifically, AUTH is a valid command,
                     and commands which access the databases are not allowed.

              user string
                     This form is only useful in the Database Section. The
                     string specifies a username that is allowed to access
                     this database after a successful AUTH command is
                     executed.

       Database Specification
              The database specification describes the database:

              data string
                     Specifies the filename for the flat text database. If
                     the filename does not begin with '.' or '/', it is
                     prepended with $datadir/. $datadir is a compile-time
                     option; you can change it by editing the Makefile or
                     running ./configure --datadir=...
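
                     The prefix rule can be written out as the following
                     Python sketch. The datadir value used in the example
                     is hypothetical; the real value is fixed when dictd
                     is built.

```python
import posixpath


def resolve_db_path(filename, datadir):
    """Apply the documented rule for data and index file names."""
    if filename.startswith((".", "/")):
        return filename                      # used exactly as written
    return posixpath.join(datadir, filename)  # relative to $datadir


print(resolve_db_path("wn.index", "/usr/share/dictd"))
# prints: /usr/share/dictd/wn.index
print(resolve_db_path("./wn.index", "/usr/share/dictd"))
# prints: ./wn.index
```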

              index string
                     Specifies the filename for the index file. The path is
                     resolved as described above for the "data" option.

              index_suffix string
                     An optional index file that makes the 'suffix' search
                     strategy faster (binary search). It is generated by
                     'dictfmt_index2suffix'. Run "dictfmt_index2suffix --help"
                     for more information. The path is resolved as described
                     above for the "data" option.

              index_word string
                     An optional index file that makes the 'word' search
                     strategy faster (binary search). It is generated by
                     'dictfmt_index2word'. Run "dictfmt_index2word --help" for
                     more information. The path is resolved as described
                     above for the "data" option.

              prefilter string
                     Specifies the prefilter command. When a chunk of the
                     compressed database is read, it will be filtered with
                     this filter before being decompressed. This may be used
                     to provide some additional compression that knows about
                     the data and can provide better compression than the LZ77
                     algorithm used by zlib.

              postfilter string
                     Specifies the postfilter command. When a chunk of the
                     compressed database is read, it will be filtered with
                     this filter before the offset and length for the entry
                     are used to access data. This is provided for symmetry
                     with the prefilter command, and may also be useful for
                     providing additional database compression.

              filter string
                     Specifies the filter command. After the entry is
                     extracted from the database, it will be filtered with
                     this filter. This may be used to provide formatting for
                     the entry (e.g., for html).

              name string
                     Specifies the short name of the database (e.g., "1913
                     Webster's"). If the string begins with @, then it
                     specifies the headword to look up in the dictionary to
                     find the short name of the database. The default is
                     "@00-database-short", but this may be changed in the
                     defs.h file at compile time (DICT_SHORT_ENTRY_NAME).

              info string
                     Specifies the information about the database. If the
                     string begins with @, then it specifies the headword to
                     look up in the dictionary to find information. The
                     default is "@00-database-info", but this may be changed
                     in the defs.h file at compile time
                     (DICT_INFO_ENTRY_NAME).

              invisible
                     Makes the dictionary invisible to clients, i.e., this
                     dictionary will not be recognized or shown by the DEFINE,
                     MATCH, SHOW INFO, SHOW SERVER and SHOW DB commands. If
                     definitions or matches are found in an invisible
                     dictionary, the name of the upper visible virtual
                     dictionary is returned. Dictionaries '*' and '!' don't
                     include invisible ones. NOTE: It makes no sense to make a
                     dictionary invisible unless it is included in a virtual
                     dictionary.

              disable_strategy string
                     Disables the specified strategy for the database. This
                     may be useful for slow dictionaries (plugins) or for
                     dictionaries included in virtual ones. For an example see
                     the file example_complex.conf.

              default_strategy string
                     Specifies the strategy which will be used if the database
                     is accessed using the strategy '.'. That is, this
                     directive sets the preferred search strategy per
                     database. For example, instead of the strategy lev, the
                     strategy word may be preferred for databases containing
                     mainly multiword phrases rather than single words.
579
580
581       Virtual Database Specification
582              The virtual database specification describes the  virtual  data‐
583              base:
584
585              database_list string
586                     Specifies a list of databases which are included into the
587                     virtual database.  Database names are in the  string  and
588                     are separated by comma.
589
590              name string
591                     Specifies the short name of the database.   See  Database
592                     Specification for more information.
593
594              info string
595                     Specifies  the  information about database.  See Database
596                     Specification for more information.
597
598              invisible
599                     Makes dictionary invisible to the clients.  See  Database
600                     Specification for more information.
601
602              disable_strategy string
603                     Disables  the specified strategy for database.  See Data‐
604                     base Specification for more information.
605
606              NOTE:  Another  way to implement a virtual database is to create
607                     database files with the dictfmt_virtual executable.
608
609       Plugin Specification
610
611              plugin string
612                     Specifies the filename of the plugin.
613
614              data string
615                     Specifies data for initializing the plugin.
616
617              name string
618                     Specifies the short name of the database.   See  Database
619                     Specification for more information.
620
621              info string
622                     Specifies  the  information about database.  See Database
623                     Specification for more information.
624
625              invisible
626                     Makes dictionary invisible to the clients.  See  Database
627                     Specification for more information.
628
629              disable_strategy string
630                     Disables  the specified strategy for database.  See Data‐
631                     base Specification for more information.
632
633              default_strategy string
634                     Sets the default search strategy for database.  See Data‐
635                     base Specification for more information.
636
637              NOTE:  Another  way to configure a plugin is to create database
638                     files with the dictfmt_plugin executable.
639
640       include string
641              The text of the file "string" (usually a database specification)
642              will  be read as if it appeared at this location in the configu‐
643              ration file.  Nested includes are not permitted.
644
645

DETERMINATION OF ACCESS LEVEL

647       When a client connects, the global access specification is scanned,  in
648       order,  until  a  specification  matches.   If  no access specification
649       exists, all access is allowed (i.e., the  action  is  the  same  as  if
650       "allow *" was the only item in the specification).  For each item, both
651       the hostname and IP are checked. For example,  consider  the  following
652       access specification:
653              allow 10.42.*
654              authonly *.edu
655              deny *
656       With  this  specification,  all  clients  in  the 10.42 network will be
657       allowed access to unrestricted databases; all clients from *.edu  sites
658       will be allowed to authenticate, but will be denied access to all data‐
659       bases, even those which  are  otherwise  unrestricted;  and  all  other
660       clients  will  have their connection terminated immediately.  The 10.42
661       network clients can send an AUTH command and gain access to  restricted
662       databases.   The *.edu clients must send an AUTH command to gain access
663       to any databases, restricted or unrestricted.
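
The first-match scan described above can be sketched as follows. This is only an illustration, not the server's code: the function name global_access is hypothetical, patterns are assumed to be glob-style wildcards, and the result when no item matches is not stated in this manual, so the sketch returns None in that case.

```python
from fnmatch import fnmatchcase

def global_access(rules, hostname, ip):
    """Return the action of the first rule matching the hostname or the IP.

    rules is an ordered list of (action, pattern) pairs, for example
    ("allow", "10.42.*").  Glob-style matching is an assumption.
    """
    if not rules:
        # No access specification at all: same as "allow *".
        return "allow"
    host = hostname.lower()
    for action, pattern in rules:
        if fnmatchcase(host, pattern.lower()) or fnmatchcase(ip, pattern):
            return action
    # Behavior for "nothing matched" is not documented here (assumption).
    return None

# The example specification from the text above.
RULES = [("allow", "10.42.*"), ("authonly", "*.edu"), ("deny", "*")]
```

With these rules, a client at 10.42.1.1 is allowed, a client whose hostname ends in .edu may only authenticate, and every other client is denied.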
664
665       When the AUTH command is sent, the access list  for  each  database  is
666       scanned, in order, just as the global access list is scanned.  However,
667       after authentication, the client has an associated username.  For exam‐
668       ple, consider the following access specification:
669              user u1
670              deny *.com
671              user u2
672              allow *
673       If  the client authenticated as u1, then the client will have access to
674       this database, even if the client comes from a  *.com  site.   In  con‐
675       trast,  if  the  client  authenticated as u2, the client will only have
676       access if it does not come from a *.com site.  In this case, the  "user
677       u2" is redundant, since that client would also match "allow *".
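
A rough sketch of the per-database scan for an authenticated client, under the same assumptions as before (glob-style patterns, hypothetical function name database_access). A "user" item is taken to grant access when it names the authenticated user, which is what the example above implies; returning False when nothing matches is an assumption.

```python
from fnmatch import fnmatchcase

def database_access(rules, username, hostname, ip):
    """Scan a database access list in order; the first matching item decides."""
    host = hostname.lower()
    for kind, arg in rules:
        if kind == "user":
            if username == arg:
                return True
        elif fnmatchcase(host, arg.lower()) or fnmatchcase(ip, arg):
            return kind == "allow"
    # No item matched: treated as "no access" here (assumption).
    return False

# The example specification from the text above.
DB_RULES = [("user", "u1"), ("deny", "*.com"), ("user", "u2"), ("allow", "*")]
```

Here u1 is accepted even from a *.com host, while u2 is rejected from a *.com host because the "deny *.com" item is reached first.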
678
679       Warning:  Checks  are  performed for domain names and for IP addresses.
680       However, if reverse DNS for a specific site is not working, it is  pos‐
681       sible  that a domain name may not be available for checking.  Make sure
682       that all denials use IP addresses.  (And consider a future enhancement:
683       if  a  domain  name  is  not available, should denials that depend on a
684       domain name match anything?  This is the more  conservative  viewpoint,
685       but it is not currently implemented.)
686

SEARCH ALGORITHMS

688       The DICT standard specifies a few search algorithms that must be imple‐
689       mented, and permits others to be supported on a server-dependent basis.
690       The  following  search  strategies  are supported by this server.  Note
691       that all strategies are case  insensitive.   Most  ignore  non-alphanu‐
692       meric, non-whitespace characters.
693
694       exact  An  exact match.  This algorithm uses a binary search and is one
695              of the fastest search algorithms available.
696
697       lev    The Levenshtein algorithm (string edit distance of  one).   This
698              algorithm  searches  for all words which are within an edit dis‐
699              tance of one from the target word.  An "edit"  means  an  inser‐
700              tion, deletion, or transposition.  This is a rapid algorithm for
701              correcting spelling  errors,  since  many  spelling  errors  are
702              within a Levenshtein distance of one from the original word.
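
As an illustration of the idea, the sketch below generates every string within one edit of the target, using only the three kinds of edit listed above, and keeps the ones that are headwords. It is not the server's implementation; the names edits1 and lev_search are hypothetical, and a lowercase ASCII alphabet is assumed for insertions.

```python
import string

def edits1(word, alphabet=string.ascii_lowercase):
    """All strings one insertion, deletion, or transposition away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = {a + b[1:] for a, b in splits if b}
    transposes = {a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1}
    inserts = {a + c + b for a, b in splits for c in alphabet}
    return (deletes | transposes | inserts) - {word}

def lev_search(word, headwords):
    """Return the headwords that are one edit away from word (case-insensitive)."""
    candidates = edits1(word.lower())
    return sorted(h for h in headwords if h.lower() in candidates)
```

For the misspelling "teh", the candidates include "the" (transposition), "tech" (insertion), and "te" (deletion), but not "ten", which would need a substitution.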
703
704       prefix Prefix  match.   This algorithm also uses a binary search and is
705              very fast.
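
A minimal sketch of a prefix lookup by binary search, assuming the headwords are held in a list that is already sorted and lowercased (the function name prefix_matches is hypothetical). An exact match can use the same bisection and compare only the entry at the returned position.

```python
from bisect import bisect_left

def prefix_matches(sorted_headwords, prefix):
    """Return all headwords starting with prefix from a sorted list."""
    start = bisect_left(sorted_headwords, prefix)  # first entry >= prefix
    matches = []
    for word in sorted_headwords[start:]:
        if not word.startswith(prefix):
            break  # sorted order: no later entry can match
        matches.append(word)
    return matches
```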
706
707       re     POSIX 1003.2 (modern) regular expression search.  Modern regular
708              expressions  are  the  ones  used  by  egrep(1).   These regular
709              expressions   allow   predefined   character   classes    (e.g.,
710              [[:alnum:]], [[:alpha:]], [[:digit:]], and [[:xdigit:]] are use‐
711              ful for this application); use * to match a sequence of 0 or more
712              matches of the previous atom; use + to match a sequence of 1  or
713              more matches of the previous atom; use ? to match a sequence  of
714              0 or 1 matches of the previous atom; use ^ to match  the  begin‐
715              ning of a word and $ to match the end of a  word;  and  allow
716              nested subexpressions and alternation with () and |.  For  exam‐
717              ple, "(foo|bar)" matches all words that contain either "foo"  or
718              "bar".   To  match these special characters, they must be quoted
719              with two backslashes (due to the quoting characteristics of  the
720              server).  Warning: Regular expression matches can take 10 to 300
721              times longer than substring matches.  On  a  busy  server,  with
722              many databases, this can require more than 5 minutes of  waiting
723              time, depending on the complexity of the regular expression.
724
725       regexp Old (basic)  regular  expressions.   These  regular  expressions
726              don't  support  |,  +,  or  ?.   Groups use escaped parentheses.
727              While modern regular expressions are generally  easier  to  use,
728              basic  regular  expressions have a back reference feature.  This
729              can be used to match a second occurrence of something  that  was
730              already  matched.   For  example, the following expression finds
731              all words that begin and end with the same three letters:
732                  ^\\(...\\).*\\1$
733
734              Note the use of the double backslashes  to  escape  the  special
735              characters.  This is required by the DICT protocol string speci‐
736              fication (a single backslash quotes the next character -- we use
737              two  to get a single backslash through to the regular expression
738              engine).  Warning: Note that the use of back references is  even
739              slower than the use of general regular expressions.
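
The quoting step can be illustrated with a short sketch. The helper dict_unquote (a hypothetical name) removes one level of backslash quoting as described above. Because Python's re module does not use basic regular expression syntax, a second hypothetical helper rewrites the escaped parentheses; that rewrite is a simplification that is only adequate for this example pattern.

```python
import re

def dict_unquote(text):
    """Remove one level of quoting: a backslash quotes the next character."""
    out = []
    chars = iter(text)
    for ch in chars:
        if ch == "\\":
            ch = next(chars, "")  # keep the quoted character itself
        out.append(ch)
    return "".join(out)

def to_python_regex(basic_re):
    """Rewrite escaped BRE groups as Python groups (simplified assumption)."""
    return basic_re.replace("\\(", "(").replace("\\)", ")")

sent = r"^\\(...\\).*\\1$"        # the pattern as typed by the client
engine = dict_unquote(sent)        # what reaches the regular expression engine
pattern = to_python_regex(engine)  # Python spelling of the same expression
```

After unquoting, the engine sees single backslashes, and the back reference \1 requires the word to end with the same three letters it began with, as in "entertainment".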
740
741       soundex
742              The  Soundex  algorithm,  a  classic algorithm for finding words
743              that sound similar to each other.  The  algorithm  encodes  each
744              word  using the first letter of the word and up to three digits.
745              Since the first letter is known, this search is relatively fast,
746              and it is sometimes good for correcting spelling errors when the
747              Levenshtein algorithm doesn't help.
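
For reference, a sketch of the classic Soundex encoding. This is the textbook variant, which pads the code with zeros to one letter and three digits; whether the server pads in the same way is an assumption, and the function name soundex is only illustrative.

```python
# Digit assigned to each consonant group in classic Soundex.
GROUPS = {"1": "bfpv", "2": "cgjkqsxz", "3": "dt", "4": "l", "5": "mn", "6": "r"}
CODES = {letter: digit for digit, letters in GROUPS.items() for letter in letters}

def soundex(word):
    """Encode word as its first letter followed by three digits."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    digits = []
    prev = CODES.get(letters[0], "")
    for c in letters[1:]:
        code = CODES.get(c, "")
        if code and code != prev:
            digits.append(code)
        if c not in "hw":  # h and w do not separate equal codes
            prev = code
    return (letters[0].upper() + "".join(digits))[:4].ljust(4, "0")
```

Words that sound alike, such as "Robert" and "Rupert", receive the same code, so a lookup only has to compare codes among headwords with the same first letter.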
748
749       substring
750              Match a substring anywhere in the headword.  This search  strat‐
751              egy  uses  a  modified Boyer-Moore-Horspool algorithm.  Since it
752              must search the whole index file, it is not as fast as the exact
753              and prefix matches.
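
The sketch below shows the textbook Boyer-Moore-Horspool search, not the modified version used by the server. The function name horspool_find is hypothetical. It returns the position of the first occurrence of the pattern, or -1 if there is none.

```python
def horspool_find(text, pattern):
    """Find pattern in text with the Boyer-Moore-Horspool skip table."""
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    # Shift for each character, based on its last position before the final one.
    shift = {c: m - 1 - i for i, c in enumerate(pattern[:-1])}
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            return pos
        # Skip ahead according to the text character under the pattern's end.
        pos += shift.get(text[pos + m - 1], m)
    return -1
```

The skip table lets the search jump over several characters at a time, but every headword in the index still has to be examined.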
754
755       suffix Suffix  match.  This search strategy also uses a modified Boyer-
756              Moore-Horspool algorithm,  and  is  as  fast  as  the  substring
757              search.   If  the optional index_suffix string file is listed in
758              the configuration file, this search is much faster.
759
760       word   Match any single word, even if part of a multi-word  entry.   If
761              the  optional index_word string file is listed in the configura‐
762              tion file, this search is much faster.
763

DATABASE FORMAT

765       Databases for dictd are distributed separately.  A database consists of
766       two files.  One is a flat text file, the other is the index.
767
768       The  flat  text file contains dictionary entries (or any other suitable
769       data), and the index contains tab-delimited tuples  consisting  of  the
770       headword,  the  byte offset at which this entry begins in the flat text
771       file, and the length of the entry in bytes.  The offset and length  are
772       encoded in base 64 using the  64-character  subset  of  Inter‐
773       national Alphabet IA5 discussed in RFC 1421  (printable  encoding)  and
774       RFC  1522 (base64 MIME).  Encoding the offsets in base 64 saves consid‐
775       erable space when compared with the usual base 10 encoding, while still
776       permitting tab characters (ASCII 9) to be used for delimiting fields in
777       a record.  Each record ends with a newline (ASCII  10),  so  the  index
778       file is human readable.
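
A small sketch of reading one index record. It assumes that each character of the offset and length fields is a single base-64 digit, most significant digit first, so that the field is a number written in base 64 rather than base64-encoded bytes. The function names are hypothetical.

```python
import string

# The 64-character alphabet in digit order: A-Z, a-z, 0-9, +, /.
B64 = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def b64_to_int(digits):
    """Interpret digits as a number written in base 64 (assumed layout)."""
    value = 0
    for ch in digits:
        value = value * 64 + B64.index(ch)
    return value

def parse_index_line(line):
    """Split a tab-delimited record into (headword, offset, length)."""
    headword, offset, length = line.rstrip("\n").split("\t")[:3]
    return headword, b64_to_int(offset), b64_to_int(length)
```

Under this assumption, the record "apple<TAB>BA<TAB>L" describes an entry that starts at byte 64 of the flat text file and is 11 bytes long.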
779
780       The flat text file may be compressed using gzip(1) (not recommended) or
781       dictzip(1) (highly recommended).  Optimal speed will be obtained  using
782       an  uncompressed  file.   However, the gzip compression algorithm works
783       very well on plain text, and can  result  in  space  savings  typically
784       between 60 and 80%.  Using a file compressed with gzip(1) is not recom‐
785       mended, however, because random access on the file can only  be  accom‐
786       plished  by  serially  decompressing the whole file, a process which is
787       prohibitively slow.  dictzip(1) uses the same compression algorithm and
788       file  format  as does gzip(1), but provides a table that can be used to
789       randomly access compressed blocks in the  file.   The  use  of  50-64kB
790       blocks for compression typically degrades compression by less than 10%,
791       while maintaining acceptable random access capabilities for all data in
792       the file.  As an added benefit, files compressed with dictzip(1) can be
793       decompressed with gzip(1) or zcat(1).  (Note: recompressing a dictzip'd
794       file using, for example, znew(1) will destroy the random access charac‐
795       teristics of the file.  Always compress data files using dictzip(1).)
796
797

SIGNALS

799       SIGHUP causes dictd to reread the configuration file and  reinitialize
800       the databases.
801
802       SIGUSR1 causes dictd to unload databases.  After that,  dictd  returns
803       status 420 (instead of 220).  To load the databases again, send SIGHUP.
804       Because database files are mmap(2)'ed, they cannot  be  updated  while
805       dictd is using them.  So, if you need to update database files and reread
806       the configuration file, first send SIGUSR1 to  dictd  to  unload  the
807       databases, then update the files, and then send SIGHUP to load them again.
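
The update procedure can be scripted along these lines. This is a sketch only: the function name update_databases is hypothetical, the caller must supply the process ID of the running dictd, and the sketch does not wait for dictd to finish unloading before the files are changed.

```python
import os
import signal

def update_databases(pid, update_files, send=os.kill):
    """Unload databases, run update_files(), then reload.

    pid is the process ID of dictd; update_files is a callable that
    replaces the database files.  send defaults to os.kill and can be
    swapped for a stand-in when testing.
    """
    send(pid, signal.SIGUSR1)  # dictd unloads its databases
    update_files()             # safe to replace the files now
    send(pid, signal.SIGHUP)   # dictd rereads the configuration and reloads
```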
808
809

ACKNOWLEDGEMENTS

811       Special thanks to Jean-loup Gailly and Mark Adler for writing the  zlib
812       general  purpose  data compression library.  The version contained with
813       dictd is not necessarily an original version and may have been modified
814       (unnecessary  files  may  have  been  deleted  to make the distribution
815       smaller; makefiles may have been  modified  to  ease  compilation;  see
816       zlib/README.DICT for any significant changes).  For more information on
817       zlib, please see the zlib home page at
818              http://www.gzip.org/zlib/
819
820       The key features of the  dictzip  random-access  compression  algorithm
821       utilize  a  documented extension of the gzip format, and do not require
822       any modifications to zlib.
823
824       Special thanks to Henry Spencer for his  regex  package.   The  package
825       contained  with  dictd  is  not necessarily an original version and may
826       have been modified (unnecessary files may have been deleted to make the
827       distribution smaller; makefiles may have been modified to ease compila‐
828       tion; see regex/README.DICT for any  significant  changes).   For  more
829       information on regex, please see
830              ftp://zoo.toronto.edu/pub/regex.shar
831

COPYING

833       The  main source files for the dictd server and the dictzip compression
834       program were written by Rik Faith (faith@dict.org) and are  distributed
835       under the terms of the GNU General Public License.  If you need to dis‐
836       tribute under other terms, write to the author.
837
838       The main libraries used by these programs  (zlib,  regex,  libmaa)  are
839       distributed  under  different  terms,  so  you  may  be able to use the
840       libraries for applications which  are  incompatible  with  the  GPL  --
841       please see the copyright notices and license information that come with
842       the libraries for more information, and consult with your  attorney  to
843       resolve these issues.
844

BUGS

846       The  regular  expression  searches  do  not ignore non-whitespace, non-
847       alphanumeric characters as do the other searches.   In  practice,  this
848       isn't much of a problem.
849
850       The 'lev' strategy doesn't work with utf8 dictionaries.
851

WARNINGS

853       Conformance  of  regular  expressions (used by 're' and 'regexp' search
854       strategies) to ERE and BRE depends on the library you build dictd with.
855
856       Whether 're' and 'regexp' strategies support utf8 depends on the library
857       you build dictd with.
858
859

FILES

861       /etc/dictd.conf
862       /usr/sbin/dictd
863

SEE ALSO

865       dictfmt(1),   dictfmt_virtual(1),   dict(1),   dictzip(1),   gunzip(1),
866       zcat(1), webster(1), RFC 2229
867
868
869
870                                 29 March 2002                        DICTD(8)