DICTD(8)                                                              DICTD(8)

NAME

       dictd - a dictionary database server

SYNOPSIS

       dictd [options]

DESCRIPTION

12       dictd  is  a  server  for  the Dictionary Server Protocol (DICT), a TCP
13       transaction based query/response  protocol  that  allows  a  client  to
14       access dictionary definitions from a set of natural language dictionary
15       databases.
16
       For security reasons, dictd drops root permissions after startup. If
       user dictd exists on the system, the daemon will run as that user and
       group dictd; otherwise it will run as user nobody, group nobody or
       nogroup (depending on the operating system distribution).
21
22       Since  startup  time is significant, the server is designed to run con‐
23       tinuously, and should not be run from inetd(8).  (However, with a  fast
24       processor, it is feasible to do so.)
25
26       Databases are distributed separately from the server.
27
       By default, dictd assumes that the index files are sorted
       alphabetically and that only alphanumeric characters from the 7-bit
       ASCII character set are used for search. This default may be
       overridden by a header in the data file. The only such headers
       implemented at this time are "00-database-allchars", which tells dictd
       that non-alphanumeric characters may also be used for search;
       "00-database-utf8", which indicates that the database uses UTF-8
       encoding; and "00-database-8bit-new", which indicates that the database
       is encoded and sorted according to a locale that uses an 8-bit
       encoding.

BACKGROUND

39       For many years, the Internet community has relied on the "webster" pro‐
40       tocol for access to natural language definitions.  The webster protocol
41       supports  access  to  a  single dictionary and (optionally) to a single
42       thesaurus.  In recent years, the number of publicly  available  webster
43       servers on the Internet has dramatically decreased.
44
45       Fortunately,  several  freely-distributable  dictionaries  and lexicons
46       have recently become available on the Internet.  However, these freely-
47       distributable databases are not accessible via a uniform interface, and
48       are not accessible from a single site.  They are often small and incom‐
49       plete  individually,  but would collectively provide an interesting and
50       useful database of English words.  Examples include  the  Jargon  file,
51       the  WordNet  database,  MICRA's  version of the 1913 Webster's Revised
52       Unabridged Dictionary, and the Free  Online  Dictionary  of  Computing.
53       (See  the DICT protocol specification (RFC) for references.)  Translat‐
54       ing and non-English dictionaries are also becoming available (for exam‐
55       ple, the FOLDOC dictionary is being translated into Spanish).
56
57       The  webster  protocol  is not suitable for providing access to a large
58       number of separate dictionary databases, and extensions to the  current
59       webster protocol were not felt to be a clean solution to the dictionary
60       database problem.
61
62       The DICT protocol is designed to provide access to multiple  databases.
63       Word  definitions  can  be  requested,  the  word index can be searched
64       (using an easily extended set of  algorithms),  information  about  the
65       server  can  be  provided (e.g., which index search strategies are sup‐
66       ported, or which databases are  available),  and  information  about  a
67       database  can  be  provided (e.g., copyright, citation, or distribution
68       information).  Further, the DICT protocol has hooks that can be used to
69       restrict access to some or all of the databases.
70
71       dictd(8)  is  a  server that implements the DICT protocol.  Bret Martin
72       implemented another server, and  several  people  (including  Bret  and
73       myself) have implemented clients in a variety of languages.
74

OPTIONS

76       -V or --version
77              Display version information.
78
79       --license
80              Display copyright and license information.
81
82       -h or --help
83              Display help information.
84
85       -v or --verbose or  -dverbose
86              Be verbose.
87
       -c file or --config file
              Specify the configuration file. The default is /etc/dictd.conf,
              but this may be changed in the defs.h file at compile time
              (DICTD_CONFIG_FILE).
92
93       -p port or --port port
94              Overrides the keyword port in Global Settings Specification sec‐
95              tion of configuration file.
96
       -i or --inetd
              Communicate on standard input/output, suitable for use from
              inetd. Because of its rather long startup time, this daemon
              was not intended to run from inetd, although with a fast
              processor it is feasible to do so. This option also implies
              --fast-start.
102
       --pp prog
              Sets a preprocessor for the configuration file, such as m4 or
              cpp. See the examples/dictd_complex.conf file from the
              distribution. By default, the configuration file is parsed
              without a preprocessor.
107
108       --depth length
109              Overrides  the  keyword  depth  in Global Settings Specification
110              section of configuration file.
111
112       --delay seconds
113              Overrides the keyword delay  in  Global  Settings  Specification
114              section of configuration file.
115
116       --facility facility
117              The  same as syslog_facility keyword in Global Settings Specifi‐
118              cation of configuration files.
119
       -f or --force
              Force the daemon to start even if an instance of the daemon is
              already running. (This is of little value unless a non-default
              port is specified with -p, since, if one instance is bound to a
              port, the second one fails when it cannot bind to the port.)
125
126       --limit children
127              Overrides  the  keyword  limit  in Global Settings Specification
128              section of configuration file.
129
130       --listen-to address
131              Overrides the keyword listen_to in Global Settings Specification
132              section of configuration file.
133
134       --locale locale
135              Overrides  the  keyword  locale in Global Settings Specification
136              section of configuration file.
137
138       -s     The same as syslog keyword in Global Settings  Specification  of
139              configuration files.
140
141       -L file or --logfile file
142              The same as log_file keyword in Global Settings Specification of
143              configuration files.
144
145       --pid-file file
146              The same as pid_file keyword in Global Settings Specification of
147              configuration files.
148
149       -m minutes  or --mark minutes
150              Overrides the keyword timestamp in Global Settings Specification
151              section of configuration file.
152
153       --default-strategy strategy
154              Overrides the keyword default_strategy in Global Settings Speci‐
155              fication section of configuration file.
156
157       --without-strategy strat1,strat2,...
158              The same as without_strategy keyword in Global Settings Specifi‐
159              cation of configuration files.
160
161       --add-strategy strategy_name:description
162              The same as add_strategy keyword in Global  Settings  Specifica‐
163              tion of configuration files.
164
165       --fast-start
166              The  same as fast_start keyword in Global Settings Specification
167              of configuration files.
168
169       --without-mmap
170              The same as without_mmap keyword in Global  Settings  Specifica‐
171              tion of configuration files.
172
173       --stdin2stdout
174              When  applied  with --inetd, each command obtained from stdin is
175              output to stdout. This option is useful for debugging.
176
177       -l option or --log option
178              The same as log_option keyword in Global Settings  Specification
179              of configuration files.
180
181       -d option
182              The  same  as debug_option keyword in Global Settings Specifica‐
183              tion of configuration files.
184

CONFIGURATION FILE

186       Introduction
187              The configuration file defaults to /etc/dictd.conf  but  can  be
188              specified on the command line with the -c option (see above).
189
190              The  configuration  file  is read into memory at startup, and is
191              not referenced again by dictd unless  a  signal  1  (SIGHUP)  is
192              received,  which  will  cause  dictd to reread the configuration
193              file.
194
195              The file is divided into sections.  The  Access  Section  should
196              come  first, followed by the Database Section, and the User Sec‐
197              tion.   The  Database  Section  is  required;  the  others   are
198              optional, but they must be in the order listed here.
199
       Syntax The following keywords are valid in a configuration file:
              access, allow, deny, group, database, data, index, filter,
              prefilter, postfilter, name, include, user, authonly, site.
              Keywords are case sensitive. String arguments that contain
              spaces should be surrounded by double quotes. Without quoting,
              strings may contain alphanumeric characters and _, -, ., and *,
              but not spaces. Strings can be continued between lines. The
              sequences \", \\, \n, and \<NL> (a backslash followed by a
              newline) are treated as a double quote, a backslash, a newline,
              and nothing, respectively. Comments start with # and extend to
              the end of the line.
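
              The escape rules for quoted strings can be illustrated with a
              short Python sketch. This is not dictd's scanner: the function
              name decode_quoted is hypothetical, and the handling of
              escapes other than the four listed is an assumption, since it
              is not specified here.

```python
def decode_quoted(body: str) -> str:
    """Decode the four documented escapes inside a double-quoted string body."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body):
            nxt = body[i + 1]
            if nxt == '"':
                out.append('"')       # \"  -> double quote
            elif nxt == "\\":
                out.append("\\")      # \\  -> backslash
            elif nxt == "n":
                out.append("\n")      # \n  -> newline
            elif nxt == "\n":
                pass                  # \<NL> -> nothing (line continuation)
            else:
                # Assumption: any other escape is kept verbatim.
                out.append(ch + nxt)
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


if __name__ == "__main__":
    # A string continued across two lines with a trailing backslash.
    print(repr(decode_quoted('Free \\"dictionary\\" \\\nserver')))
```

              The last case shows why strings can be continued between
              lines: the backslash and the newline that follows it both
              disappear from the resulting value.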
210
211       Global Settings Section
212
              global { global settings specification }
                     Used to set global dictd settings such as the log file,
                     syslog facility, locale, and so on.
216
217              EXAMPLE:
218                     See examples/dictd4.conf file from the distribution.
219
220       Access Section
221
222              access { access specification }
223                     This  section contains access restrictions for the server
224                     and all of the databases collectively.  Per-database con‐
225                     trol is specified in the Database Section.
226
227              EXAMPLE:
228                     See examples/dictd3.conf file from the distribution.
229
230       Database Section
231
232              database string { database specification }
233                     The  string  specifies the name of the database (e.g., wn
234                     or web1913).  (This is an arbitrary name selected by  the
235                     administrator, and is not necessarily related to the file
236                     name or any name listed in the data file.  A short,  easy
237                     to  type  name  is  often selected for easy use with dict
238                     -d.)
239
240                     EXAMPLE: See examples/dictd*.conf files from the  distri‐
241                     bution.
242
243                     NOTE:  If  the files specified in the database specifica‐
244                     tion do not exist on the system, dictd may silently fail.
245
246              database_virtual string { virtual database specification }
247                     This section specifies the virtual database.  The  string
248                     specifies the name of the database (e.g., en-ru or fren).
249
250                     EXAMPLE:   See   examples/dictd_virtual.conf   or   exam‐
251                     ples/dictd_complex.conf files from the distribution.
252
253              database_plugin string { plugin specification }
254                     This section specifies the plugin.  The string  specifies
255                     the name of the database.
256
257                     EXAMPLE:   See  examples/dictd_plugin_dbi.conf  or  exam‐
258                     ples/dictd_complex.conf files from the distribution.
259
              database_mime string { mime specification }
                     Traditionally, databases created for dictd contained
                     plain text only, because dictd releases before 1.10.0
                     did not fully support the OPTION MIME command (see RFC
                     2229). This section describes a special database that
                     behaves differently depending on whether the OPTION MIME
                     command was received from the client: it returns either
                     plain text or specially formatted content, depending on
                     whether the DICT client supports (or wants to receive)
                     MIME-formatted content. The string specifies the name of
                     the database.

                     NOTE: This applies to the DEFINE command only. The MATCH,
                     SHOW DB, SHOW STRAT, SHOW INFO, SHOW SERVER, and HELP
                     commands return text prepended with an empty line only.
276
277                     EXAMPLE: See examples/dictd_mime.conf file from the  dis‐
278                     tribution.
279
              database_exit
                     Excludes the databases that follow from the '*' database.
                     By default, '*' means all available databases. See the
                     examples/dictd_virtual.conf file for an example
                     configuration.

                     NOTE: If you use 'virtual' dictionaries, you should use
                     this directive; otherwise you will search the same
                     dictionary twice.
289
290              User Section
291
292                     user string string
293                            The first string specifies the username,  and  the
294                            second string specifies the shared secret for this
295                            username.  When the  AUTH  command  is  used,  the
296                            client will provide the username and a hashed ver‐
297                            sion of the shared secret.  If the  shared  secret
298                            matches,  the  user is said to have authenticated,
299                            and will have access  to  databases  whose  access
300                            specifications  allow  that  user  (by name, or by
301                            wildcard).  If present, this section  must  appear
302                            last in the configuration file.  There may be many
303                            user entries.  The shared secret  should  be  kept
304                            secret,  as anyone who has access to it can access
305                            the  shared  databases  (assuming  access  is  not
306                            denied by domain name).
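
                            As a sketch of what the client sends: RFC 2229
                            defines the AUTH authentication string as the
                            lower-case hexadecimal MD5 checksum of the msg-id
                            from the server banner (angle brackets included)
                            concatenated with the shared secret. The function
                            names, username, msg-id, and secret below are
                            hypothetical values chosen for illustration.

```python
import hashlib


def auth_string(msg_id: str, shared_secret: str) -> str:
    """MD5 of msg-id + shared secret, as lower-case hex (RFC 2229 AUTH)."""
    data = (msg_id + shared_secret).encode("utf-8")
    return hashlib.md5(data).hexdigest()


def auth_command(username: str, msg_id: str, shared_secret: str) -> str:
    """Build the AUTH line a client would send after reading the banner."""
    return f"AUTH {username} {auth_string(msg_id, shared_secret)}"


if __name__ == "__main__":
    # Hypothetical banner msg-id and a secret matching a "user" entry.
    print(auth_command("alice", "<1234.5678@dict.example.org>", "s3cret"))
```

                            Because the msg-id changes with every connection,
                            the hash is different each time, but anyone who
                            learns the shared secret can still compute it.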
307
308              Access Specification
309                     Access  specifications may occur in the Access Section or
310                     in the Database Section.  The access  specification  will
311                     be described here.
312
313                     For  allow, deny, and authonly, a star (*) may be used as
314                     a wild card that matches any  number  of  characters.   A
315                     question  mark (?) may be used as a wildcard that matches
316                     a single character.  For example, 10.0.0.* and *.edu  are
317                     valid strings.
318
319                     Further,  a  range of IP addresses and an IP address fol‐
320                     lowed by  a  netmask  may  be  specified.   For  example,
321                     10.0.0.0:10.0.0.255,  10.0.0.0/24, and 10.0.0.* all spec‐
322                     ify the same range of IP  numbers.   Notation  cannot  be
323                     combined on the same line.  If the notation does not make
324                     sense, access will be denied by default.  Use the --debug
325                     auth option to debug related problems.
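
                     The equivalence of the three notations in the example
                     can be checked with a short Python sketch. It uses the
                     standard ipaddress and fnmatch modules as stand-ins;
                     it is not how dictd evaluates access specifications,
                     and the helper names are hypothetical.

```python
import fnmatch
import ipaddress


def in_range(addr: str, spec: str) -> bool:
    """Match the 'first:last' range notation, e.g. 10.0.0.0:10.0.0.255."""
    first, last = (ipaddress.ip_address(part) for part in spec.split(":"))
    return first <= ipaddress.ip_address(addr) <= last


def in_netmask(addr: str, spec: str) -> bool:
    """Match the address/netmask notation, e.g. 10.0.0.0/24."""
    return ipaddress.ip_address(addr) in ipaddress.ip_network(spec)


def in_wildcard(addr: str, pattern: str) -> bool:
    """Match the wildcard notation, e.g. 10.0.0.* (shell-style globbing)."""
    return fnmatch.fnmatchcase(addr, pattern)


# Probe every address from 9.255.255.255 through 10.0.1.0.
base = ipaddress.ip_address("10.0.0.0")
candidates = [str(base + offset) for offset in range(-1, 257)]

all_agree = all(
    in_range(a, "10.0.0.0:10.0.0.255")
    == in_netmask(a, "10.0.0.0/24")
    == in_wildcard(a, "10.0.0.*")
    for a in candidates
)

if __name__ == "__main__":
    print("notations agree on all probed addresses:", all_agree)
```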
326
327                     Note  that  these specifications take only one string per
328                     specification line.  However, you can have multiple lines
329                     of each type.
330
331                     The syntax is as follows:
332
                     allow string
                            The string specifies a domain name or IP address
                            that is allowed access to the server (in the
                            Access Section) or to a database (in the Database
                            Section). Only one string is permitted per "allow"
                            line, but the configuration file may contain
                            multiple "allow" lines.
341
342                     deny string
343                            The  string  specifies a domain name or IP address
344                            which is denied  access  to  the  server  (in  the
345                            Access  Section) or to a database (in the Database
346                            Section).  Note that if reverse DNS is  not  work‐
347                            ing,  then  only  the  IP  number will be checked.
348                            Therefore, it is essential to deny networks  based
349                            on  IP number, since a denial based on domain name
350                            may not always be checked.
351
352                     authonly string
353                            This form is only useful in  the  Access  Section.
354                            The  string  specifies a domain name or IP address
355                            which is allowed access to the server but  not  to
356                            any  of  the  databases.   All  commands are valid
357                            except DEFINE, MATCH, and SHOW DB.  More  specifi‐
358                            cally  AUTH is a valid command, and commands which
359                            access the databases are not allowed.
360
361                     user string
362                            This form is only useful in the Database  Section.
363                            The string specifies a username that is allowed to
364                            access this database after a successful AUTH  com‐
365                            mand is executed.
366
367       Global Settings Specification
368              This section describes the following parameters:
369
370              port string_or_number
371                     Specifies  the  port  or  service name (e.g., 2628).  The
372                     default is 2628, as specified in the DICT  Protocol  RFC,
373                     but  may  be  changed  in the defs.h file at compile time
374                     (DICT_DEFAULT_SERVICE).
375
376              site string
377                     Used to specify the filename  for  the  site  information
378                     file,  a  flat  text  file  which  will  be  displayed in
379                     response to the SHOW SERVER command.
380
381                     EXAMPLE: See examples/dictd4.conf file from the distribu‐
382                     tion.
383
384              site_no_banner boolean
385                     By  default SHOW SERVER command outputs information about
386                     dictd version and an operating system type.  This  option
387                     disables this.
388
              site_no_uptime boolean
                     By default, the SHOW SERVER command outputs the uptime
                     of dictd, the number of forks since startup, and the
                     number of forks per hour. This option disables this.
393
394              site_no_dblist boolean
395                     By  default SHOW SERVER command outputs internal informa‐
396                     tion about databases, such  as  a  number  of  headwords,
397                     index size and so on.  This option disables this.
398
399              delay number
400                     Specifies  the  number  of  seconds  a client may be idle
401                     before the server will close the connection.   Idle  time
402                     is defined to be the time the server is waiting for input
403                     and does not include the time the server spends searching
404                     the  database.  The  default is 0 seconds (no limit), but
405                     may be  changed  in  the  defs.h  file  at  compile  time
406                     (DICT_DEFAULT_DELAY).
407
                     NOTE: Setting the delay option disables the limit_time
                     option. Only one of them (the last one specified in
                     dictd.conf) is in effect.
411
412                     NOTE:  Connections  are  closed  without warning since no
413                     provision for premature connection termination is  speci‐
414                     fied in the DICT protocol RFC.
415
416              depth number
417                     Specify  the  queue  length for listen(2).  Specifies the
418                     number of pending socket connections which are queued  by
419                     the   operating   system.   Some  operating  systems  may
420                     silently limit this value to 5 (older BSD systems) or 128
421                     (Linux).   The  default  is  10 but may be changed in the
422                     defs.h file at compile time (DICT_QUEUE_DEPTH).
423
424              limit_childs number
425                     Specifies the number  of  daemons  that  may  be  running
426                     simultaneously.   Each  daemon  services a single connec‐
427                     tion.  If the limit is exceeded, a  (serialized)  connec‐
428                     tion  will  be made by the server process, and a response
429                     code 420 (server temporarily unavailable) will be sent to
430                     the client.  This parameter should be adjusted to prevent
431                     the server machine from being overloaded by dict clients,
432                     but should not be set so low that many clients are denied
433                     useful connections.  The  default  is  100,  but  may  be
434                     changed  in  the  defs.h  file at compile time (DICT_DAE‐
435                     MON_LIMIT_CHILDS).
436
437              limit number
438                     Synonym for  limit_childs.   For  backward  compatibility
439                     only.
440
441              limit_matches number
442                     Specifies  the  maximum  number  of  matches  that can be
443                     returned by MATCH query. Zero means no limit. The default
444                     is 2000.
445
446              limit_definitions number
447                     Specifies  the  maximum number of definitions that can be
448                     returned by  DEFINE  query.  Zero  means  no  limit.  The
449                     default is 200.
450
451              limit_time number
452                     Specifies  the number of seconds a client may talk to the
453                     server before the server will close the connection.   The
454                     default  is  600 seconds (10 minutes), but may be changed
455                     in    the     defs.h     file     at     compile     time
456                     (DICT_DEFAULT_LIMIT_TIME).
457
                     NOTE: Setting the limit_time option disables the delay
                     option. Only one of them (the last one specified in
                     dictd.conf) is in effect.
461
462                     NOTE:  Connections  are  closed  without warning since no
463                     provision for premature connection termination is  speci‐
464                     fied in the DICT protocol RFC.
465
466              limit_queries number
467                     Specifies  the  number of queries (MATCH, DEFINE, SHOW DB
468                     etc.)  that client may send  to  the  server  before  the
469                     server  will  close the connection.  Zero means no limit.
470                     The default is 2000, but may be  changed  in  the  defs.h
471                     file at compile time (DICT_DEFAULT_LIMIT_QUERIES).
472
              timestamp number
                     How often a timestamp should be logged (in minutes).
                     (This is effective only if logging has been enabled with
                     the -s or -L option, or with a debugging option.)
477
              log_option option
                     Specify a logging option. This is effective only if
                     logging has been enabled with the -s or -L option or in
                     the configuration file, or if logging to the console has
                     been activated with a debugging option (e.g., --debug
                     nodetach). Only one option may be set with each
                     invocation of this option; however, this option may be
                     given multiple times in the configuration file or on the
                     dictd command line. For instance:
                     dictd -s --log stats --log found --log notfound
                     is a valid command line, and sets three logging options.
489
490                     Some of the more verbose logging options are used primar‐
491                     ily  for debugging the server code, and are not practical
492                     for normal use.
493
494                     server Log server diagnostics.  This  is  extremely  ver‐
495                            bose.
496
497                     connect
498                            Log all connections.
499
500                     stats  Log all children terminations.
501
502                     command
503                            Log all commands.  This is extremely verbose.
504
505                     client Log results of CLIENT command.
506
507                     found  Log all words found in the databases.
508
509                     notfound
510                            Log all words not found in the databases.
511
512                     timestamp
513                            When  logging to a file, use a full timestamp like
514                            that which syslog would  produce.   Otherwise,  no
515                            timestamp is made, making the files shorter.
516
517                     host   Log name of foreign host.
518
519                     auth   Log authentication failures.
520
521                     min    Set  a  minimal  number of options.  If logging is
522                            activated (to a  file,  or  via  syslog),  and  no
523                            options  are  set, then the minimal set of options
524                            will be used.  If options are set, then only those
525                            options specified will be used.
526
527                     all    Set all of the options.
528
529                     none   Clear all of the options.
530
531                     To  facilitate location of interesting information in the
532                     log file, entries are marked with initial  letters  indi‐
533                     cating the class of the line being logged:
534
535                     I      Information about the server, connections, or ter‐
536                            mination statistics.  These  lines  are  generally
537                            not designed to be parsed automatically.
538
539                     E      Error messages.
540
541                     C      CLIENT command information.
542
543                     D      Definitions found in the databases searched.
544
545                     M      Matches found in the database searched.
546
547                     N      Matches  which  were  not  found  in the databases
548                            searched.
549
550                     T      Trace of exact line sent by client.
551
552                     A      Authentication information.
553
554                     To preserve anonymity of the client, do not use the  con‐
555                     nect  or  host options.  Clients may or may not send host
556                     information using the CLIENT command, but this should  be
557                     an option that is selectable on the client side.
558
559              debug_option string
560                     Activate  a  debugging option.  There are several, all of
561                     which are only useful to developers.  They are documented
562                     here  for  completeness.  A list can be obtained interac‐
563                     tively by using -d with an illegal option.
564
565                     verbose
566                            The same as -v or --verbose.   Adds  verbosity  to
567                            other options.
568
569                     scan   Debug the scanner for the configuration file.
570
571                     parse  Debug the parser for the configuration file.
572
573                     search Debug the character folding and binary search rou‐
574                            tines.
575
576                     init   Report database initialization.
577
578                     port   Log client-side port number to the log file.
579
580                     lev    Debug Levenshtein search algorithm.
581
582                     auth   Debug the authorization routines.
583
584                     nodetach
585                            Do not detach as a  background  process.   Implies
586                            that  a  copy  of  the log file will appear on the
587                            standard output.
588
589                     nofork Do not fork daemons to  service  requests.   Be  a
590                            single-threaded server.  This option implies node‐
591                            tach, and is most useful for using a  debugger  to
592                            find the point at which daemon processes are dump‐
593                            ing core.
594
595                     alt    Debugs altcompare in index.c.
596
              locale string
                     Specifies the locale used for searching. If no locale is
                     specified, the "C" locale is used. The locale used for
                     the server should be the same as that used for dictfmt
                     when the database was built (specifically, the locale
                     under which the index was sorted). The locale should be
                     specified for both 8-bit and UTF-8 formats. If the locale
                     name contains the substring utf8 or utf-8, UTF-8 format
                     is expected. Note that if your database is not in 7-bit
                     ASCII or UTF-8 format, the dictd server will not be
                     compliant with RFC 2229.

                     NOTE: If UTF-8 or 8-bit dictionaries are included in the
                     configuration file and the appropriate --locale has not
                     been specified, dictd will fail to start. This implies
                     that dictd will not run with both UTF-8 and 8-bit
                     dictionaries in the configuration file.
614
615              add_strategy strategy_name description
616                     Adds the strategy strategy_name with the description
617                     description.  This new search strategy may be implemented
618                     with the help of plugins.  Both strategy_name and
619                     description are strings.
620
621              default_strategy string
622                     Set the server's default search strategy for MATCH search
623                     type.  The compiled-in default is 'lev'.  It is also pos‐
624                     sible  to  set  default  strategy  per   database.    See
625                     default_strategy  keyword  in Database specification sec‐
626                     tion.
627
628              disable_strategy string
629                     Disable specified strategies.  By default all implemented
630                     search  strategies  are  enabled.  It is also possible to
631                     disable strategies per  database.   See  disable_strategy
632                     keyword in Database specification section.
633
634              listen_to string
635                     Binds  socket  to  the specified address.  If you want to
636                     allow connections to dict  server  from  localhost  only,
637                     apply
638                     listen_to 127.0.0.1
639
640              syslog string
641                     Log using the syslog(3) facility.
642
643              syslog_facility string
644                     Specifies  the  syslog  facility to use.  The use of this
645                     option implies the -s option to turn on logging via  sys‐
646                     log.   When  the  operating system libraries support SYS‐
647                     LOG_NAMES, the names used for this option should be those
648                     listed in syslog.conf(5).  Otherwise, the following names
649                     are used (assuming the particular facility is defined  in
650                     the  header  files):  auth,  authpriv, cron, daemon, ftp,
651                     kern,  lpr,  mail,  news,  syslog,  user,  uucp,  local0,
652                     local1,  local2,  local3,  local4,  local5,  local6,  and
653                     local7.
654
655              log_file string
656                     Specify the file for logging.  The filename specified  is
657                     recomputed  on  each use using the strftime(3) call.  For
658                     example, a filename ending in ".%Y%m%d" will write to log
659                     files  ending  in  the year, month, and date that the log
660                     entry was written.
661                     NOTE: If dictd does not have write  permission  for  this
662                     file, it will silently fail.
663
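The strftime(3) expansion described for log_file can be illustrated with a minimal Python sketch. The path /var/log/dictd.log and the helper name expand_log_name are assumptions made for this example; they are not taken from dictd itself.

```python
import time

def expand_log_name(template, when=None):
    """Expand strftime(3) conversions in a log file name template."""
    moment = time.localtime() if when is None else when
    return time.strftime(template, moment)

# A fixed date (29 March 2002) keeps the example reproducible.
fixed = time.strptime("2002-03-29", "%Y-%m-%d")
print(expand_log_name("/var/log/dictd.log.%Y%m%d", fixed))
# prints /var/log/dictd.log.20020329
```

Because the name is recomputed on each use, a template like this yields one log file per day without any external rotation.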
664              pid_file string
665                     The  specified  filename  will  be created to contain the
666                     process id of the main  dictd  process.  The  default  is
667                     /var/run/dictd.pid
668
669              fast_start
670                     By default, dictd creates an additional in-memory index
671                     to make searches faster.  This option disables that
672                     behaviour and makes startup faster.
673
674              without_mmap
675                     Do not use the mmap(2) function; read entire files into
676                     memory instead.  Use this option only if you know
677                     exactly what you are doing.
678
679       Database Specification
680              The database specification describes the database:
681
682              data string
683                     Specifies the filename for the flat text database.  If
684                     the filename does not begin with '.' or '/', it is
685                     prefixed with $datadir/, which is a compile-time setting.
686                     You can change it by editing the Makefile or by running
687                     ./configure --datadir=...
688
689              index string
690                     Specifies the filename for the index file.  The path is
691                     resolved as described above for the "data" option.
692
693              index_suffix string
694                     An optional index file that makes the 'suffix' search
695                     strategy faster (binary search).  It is generated by
696                     'dictfmt_index2suffix'. Run "dictfmt_index2suffix --help"
697                     for more information.  The path is resolved as described
698                     above for the "data" option.
699
700              index_word string
701                     An optional index file that makes the 'word' search
702                     strategy faster (binary search).  It is generated by
703                     'dictfmt_index2word'. Run "dictfmt_index2word --help" for
704                     more information.  The path is resolved as described
705                     above for the "data" option.
706
707              prefilter string
708                     Specifies the  prefilter command.  When  a chunk  of  the
709                     compressed  database  is  read, it will be filtered  with
710                     this filter before being decompressed.  This may be  used
711                     to provide  some additional compression  that knows about
712                     the data and can provide better compression than the LZ77
713                     algorithm used by zlib.
714
715              postfilter string
716                     Specifies  the  postfilter  command.  When a chunk of the
717                     compressed database is read, it  will  be  filtered  with
718                     this  filter  before  the offset and length for the entry
719                     are used to access data.  This is provided  for  symmetry
720                     with  the  prefilter  command, and may also be useful for
721                     providing additional database compression.
722
723              filter string
724                     Specifies  the  filter  command.   After  the  entry   is
725                     extracted  from  the  database,  it will be filtered with
726                     this filter.  This may be used to provide formatting  for
727                     the entry (e.g., for html).
728
729              name string
730                     Specifies  the  short  name  of the database (e.g., "1913
731                     Webster's").  If the string begins with @, then it speci‐
732                     fies  the  headword  to look up in the dictionary to find
733                     the  short  name  of  the  database.   The   default   is
734                     "@00-database-short",  but  this  may  be  changed in the
735                     defs.h file at compile time (DICT_SHORT_ENTRY_NAME).
736
737              info string
738                     Specifies the information about database.  If the  string
739                     begins  with @, then it specifies the headword to look up
740                     in the dictionary to find information.   The  default  is
741                     "@00-database-info",  but  this  may  be  changed  in the
742                     defs.h file at compile time (DICT_INFO_ENTRY_NAME).
743
744              invisible
745                     Makes the dictionary invisible to clients, i.e. the
746                     dictionary will not be recognized or shown by the DEFINE,
747                     MATCH, SHOW INFO, SHOW SERVER and SHOW DB commands.  If
748                     definitions or matches are found in an invisible
749                     dictionary, the name of the enclosing visible virtual
750                     dictionary is returned.  The dictionaries '*' and '!' do
751                     not include invisible ones.  NOTE: Invisible dictionaries
752                     are completely inaccessible (and invisible) to the client
753                     unless they are included in a virtual or MIME dictionary
754                     (see the database_virtual and database_mime sections).
755
756              disable_strategy string
757                     Disables  the  specified strategy for database.  This may
758                     be useful for slow dictionaries (plugins) or for  dictio‐
759                     naries included to virtual ones.  For an example see file
760                     examples/dictd_complex.conf.
761
762              default_strategy string
763                     Specifies the strategy that is used when the database is
764                     accessed using the strategy '.'; this directive sets the
765                     preferred search strategy per database.  For example,
766                     instead of the strategy lev, the strategy word may be
767                     preferred for databases that mainly contain multiword
768                     phrases rather than single words.
769
770       Virtual Database Specification
771              The  virtual  database specification describes the virtual data‐
772              base:
773
774              database_list string
775                     Specifies the list of databases that are included in the
776                     virtual database.  Database names are given in the string,
777                     separated by commas.
778
779              name string
780                     Specifies the short name of the database.  See Database
781                     Specification.
782
783              info string
784                     Specifies the information about the database.  See
785                     Database Specification.
786
787              invisible
788                     Makes the dictionary invisible to clients.  See Database
789                     Specification.
790
791              disable_strategy string
792                     Disables the specified strategy for the database.  See
793                     Database Specification.
794
795       Plugin Specification
796
797              plugin string
798                     Specifies a filename of the plugin.
799
800              data string
801                     Specifies data for initializing plugin.
802
803              name string
804                     Specifies the short name of the database.   See  Database
805                     Specification for more information.
806
807              info string
808                     Specifies  the  information about database.  See Database
809                     Specification for more information.
810
811              invisible
812                     Makes dictionary invisible to the clients.  See  Database
813                     Specification for more information.
814
815              disable_strategy string
816                     Disables  the specified strategy for database.  See Data‐
817                     base Specification for more information.
818
819              default_strategy string
820                     Sets the default search strategy for database.  See Data‐
821                     base Specification for more information.
822
823       Mime Specification
824
825              dbname_nomime string
826                     Specifies  the  real  database name which is used in case
827                     OPTION MIME command was NOT received from a client.
828
829              dbname_mime string
830                     Specifies the real database name which is  used  in  case
831                     OPTION MIME command WAS received from a client.  A neces‐
832                     sary MIME header is set while creating a  database.   See
833                     dictfmt(1) for option --mime-header.
834
835              name string
836                     Specifies  the  short name of the database.  See Database
837                     Specification for more information.
838
839              info string
840                     Specifies the information about database.   See  Database
841                     Specification for more information.
842
843              invisible
844                     Makes  dictionary invisible to the clients.  See Database
845                     Specification for more information.
846
847              disable_strategy string
848                     Disables the specified strategy for database.  See  Data‐
849                     base Specification for more information.
850
851              default_strategy string
852                     Sets the default search strategy for database.  See Data‐
853                     base Specification for more information.
854
855       include string
856              The text of the file "string" (usually a database specification)
857              will  be read as if it appeared at this location in the configu‐
858              ration file.  Nested includes are not permitted.
859

DETERMINATION OF ACCESS LEVEL

861       When a client connects, the global access specification is scanned,  in
862       order,  until  a  specification  matches.   If  no access specification
863       exists, all access is allowed (e.g., the  action  is  the  same  as  if
864       "allow *" was the only item in the specification).  For each item, both
865       the hostname and IP are checked. For example,  consider  the  following
866       access specification:
867              allow 10.42.*
868              authonly *.edu
869              deny *
870       With  this  specification,  all  clients  in  the 10.42 network will be
871       allowed access to unrestricted databases; all clients from *.edu  sites
872       will be allowed to authenticate, but will be denied access to all data‐
873       bases, even those which  are  otherwise  unrestricted;  and  all  other
874       clients  will  have their connection terminated immediately.  The 10.42
875       network clients can send an AUTH command and gain access to  restricted
876       databases.   The *.edu clients must send an AUTH command to gain access
877       to any databases, restricted or unrestricted.
878
879       When the AUTH command is sent, the access list  for  each  database  is
880       scanned, in order, just as the global access list is scanned.  However,
881       after authentication, the client has an associated username.  For exam‐
882       ple, consider the following access specification:
883              user u1
884              deny *.com
885              user u2
886              allow *
887       If  the client authenticated as u1, then the client will have access to
888       this database, even if the client comes from a  *.com  site.   In  con‐
889       trast,  if  the  client  authenticated as u2, the client will only have
890       access if it does not come from a *.com site.  In this case, the  "user
891       u2" is redundant, since that client would also match "allow *".
892
893       Warning:  Checks  are  performed for domain names and for IP addresses.
894       However, if reverse DNS for a specific site is not working, it is  pos‐
895       sible  that a domain name may not be available for checking.  Make sure
896       that all denials use IP addresses.  (And consider a future enhancement:
897       if  a  domain  name  is  not available, should denials that depend on a
898       domain name match anything?  This is the more  conservative  viewpoint,
899       but it is not currently implemented.)
900
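The first-match scan of the global access specification can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name first_match is hypothetical, patterns are treated as shell-style globs via fnmatch, and the host names and addresses are made up. It is not dictd's actual implementation.

```python
from fnmatch import fnmatchcase

def first_match(spec, hostname, ip):
    """Return the action of the first rule that matches the hostname or the IP.

    spec is an ordered list of (action, pattern) pairs.  hostname may be
    None when reverse DNS did not yield a name.
    """
    if not spec:
        return "allow"  # no specification behaves like a lone "allow *"
    for action, pattern in spec:
        if fnmatchcase(ip, pattern):
            return action
        if hostname and fnmatchcase(hostname.lower(), pattern.lower()):
            return action
    return None  # nothing matched; the text does not describe this case

spec = [("allow", "10.42.*"), ("authonly", "*.edu"), ("deny", "*")]
print(first_match(spec, "host.example.com", "10.42.3.7"))   # allow
print(first_match(spec, "cs.example.edu", "192.0.2.10"))    # authonly
print(first_match(spec, None, "192.0.2.10"))                # deny
```

The last call shows why rules that depend on a domain name are fragile: when no hostname is available, the *.edu rule cannot match, and the client falls through to a later rule.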

SEARCH ALGORITHMS

902       The DICT standard specifies a few search algorithms that must be imple‐
903       mented, and permits others to be supported on a server-dependent basis.
904       The  following  search  strategies  are supported by this server.  Note
905       that all strategies are case  insensitive.   Most  ignore  non-alphanu‐
906       meric, non-whitespace characters.
907
908       exact  An  exact match.  This algorithm uses a binary search and is one
909              of the fastest search algorithms available.
910
911       lev    The Levenshtein algorithm (string edit distance of  one).   This
912              algorithm  searches  for all words which are within an edit dis‐
913              tance of one from the target word.  An "edit"  means  an  inser‐
914              tion, deletion, or transposition.  This is a rapid algorithm for
915              correcting spelling  errors,  since  many  spelling  errors  are
916              within a Levenshtein distance of one from the original word.
917
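A minimal Python sketch of an edit-distance-one search, using only the three edit kinds named above (insertion, deletion, transposition). The function names, the lowercase ASCII alphabet, and the small headword list are assumptions for illustration; the server's own routine may differ in detail and in how it consults the index.

```python
import string

def edits_one(word, alphabet=string.ascii_lowercase):
    """Return all strings one insertion, deletion, or transposition away."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletions = {a + b[1:] for a, b in splits if b}
    transpositions = {a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1}
    insertions = {a + c + b for a, b in splits for c in alphabet}
    return (deletions | transpositions | insertions) - {word}

def lev_matches(word, headwords):
    """Return the headwords within one edit of word (case insensitive)."""
    candidates = edits_one(word.lower())
    return sorted(h for h in headwords if h.lower() in candidates)

headwords = ["dictionary", "dictionaries", "diction", "dictate"]
print(lev_matches("dictionnary", headwords))  # one deletion away
print(lev_matches("dicitonary", headwords))   # one transposition away
```

Both calls print ['dictionary'].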
918       prefix Prefix  match.   This algorithm also uses a binary search and is
919              very fast.
920
921       nprefix
922              Like prefix, but returns only the specified range of matches.
923              For example, when the prefix strategy returns 1000 matches, you
924              can skip the first 800 and get only the next 100.  This is done
925              by specifying the limits in the query, like this: 800#100#app,
926              where 800 is the skip count, 100 is the number of matches you
927              want, and "app" is your query.  This strategy makes it possible
928              to implement a DICT client with fast autocompletion (although it
929              is not trivial), just as many standalone dictionary programs do.
930
931              NOTE:  If  you  access  the dictionary "*" (or virtual one) with
932              nprefix strategy, the same range is set for each database in it,
933              but globally for all matches found in all databases.
934
935              NOTE: If you access a non-English dictionary, the returned
936              matches may be (and mostly will be) NOT in alphabetical
937              order.
938
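The skip#count#query form used by nprefix can be sketched in Python like this. The helper names and the sample word list are assumptions, and so is the fallback for a query without the two numeric fields; the text above does not say how the server treats such a query.

```python
from bisect import bisect_left

def parse_nprefix(query):
    """Split 'skip#count#prefix' into (skip, count, prefix)."""
    parts = query.split("#", 2)
    if len(parts) == 3 and parts[0].isdigit() and parts[1].isdigit():
        return int(parts[0]), int(parts[1]), parts[2]
    return 0, None, query  # assumption: treat it as a plain prefix query

def nprefix(sorted_headwords, query):
    """Return the requested slice of the prefix matches."""
    skip, count, prefix = parse_nprefix(query)
    start = bisect_left(sorted_headwords, prefix)  # binary search
    matches = []
    for word in sorted_headwords[start:]:
        if not word.startswith(prefix):
            break
        matches.append(word)
    end = None if count is None else skip + count
    return matches[skip:end]

words = sorted(["app", "apple", "application", "apply", "apt", "banana"])
print(parse_nprefix("800#100#app"))   # (800, 100, 'app')
print(nprefix(words, "1#2#app"))      # ['apple', 'application']
```

An autocompletion client could request successive windows (0#20#app, 20#20#app, and so on) instead of fetching every match at once.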
939       re     POSIX 1003.2 (modern) regular expression search.  Modern regular
940              expressions are  the  ones  used  by  egrep(1).   These  regular
941              expressions    allow   predefined   character   classes   (e.g.,
942              [[:alnum:]], [[:alpha:]], [[:digit:]], and [[:xdigit:]] are
943              useful for this application); uses * to match a sequence of 0 or
944              more matches of the previous atom; uses + to match a sequence of
945              1 or more matches of the previous atom; uses ? to match a
946              sequence of 0 or 1 matches of the previous atom; uses ^ to match
947              the beginning and $ to match the end of a word; and allows
948              nested subexpression and alternation with () and |.   For  exam‐
949              ple,  "(foo|bar)" matches all words that contain either "foo" or
950              "bar".  To match these special characters, they must  be  quoted
951              with  two backslashes (due to the quoting characteristics of the
952              server).  Warning: Regular expression matches can take 10 to 300
953              times  longer  than  substring  matches.  On a busy server, with
954              many databases, this can require more than 5 minutes of waiting
955              time, depending on the complexity of the regular expression.
956
957       regexp Old  (basic)  regular  expressions.   These  regular expressions
958              don't support |, +,  or  ?.   Groups  use  escaped  parentheses.
959              While  modern  regular  expressions are generally easier to use,
960              basic regular expressions have a back reference  feature.   This
961              can  be  used to match a second occurrence of something that was
962              already matched.  For example, the  following  expression  finds
963              all words that begin and end with the same three letters:
964                  ^\\(...\\).*\\1$
965
966              Note  the  use  of  the double backslashes to escape the special
967              characters.  This is required by the DICT protocol string speci‐
968              fication (a single backslash quotes the next character -- we use
969              two to get a single backslash through to the regular  expression
970              engine).   Warning:  Note  that  the use of backtracking is even
971              slower than the use of general regular expressions.
972
973       soundex
974              The Soundex algorithm, a classic  algorithm  for  finding  words
975              that  sound  similar  to each other.  The algorithm encodes each
976              word using the first letter of the word and up to three  digits.
977              Since the first letter is known, this search is relatively fast,
978              and is sometimes good for correcting spelling errors when the
979              Levenshtein algorithm doesn't help.
980
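For reference, a minimal Python sketch of the classic (American) Soundex encoding: the first letter followed by up to three digits, padded with zeros. This follows the commonly published rules; whether dictd's own routine matches it in every corner case is an assumption.

```python
# Map each consonant to its Soundex digit; vowels, h, w and y have no code.
CODES = {}
for digit, letters in enumerate(["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1):
    for letter in letters:
        CODES[letter] = str(digit)

def soundex(word):
    """Return the four-character Soundex code of word ('' if no letters)."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    previous = CODES.get(letters[0], "")
    for c in letters[1:]:
        code = CODES.get(c, "")
        if code and code != previous:
            result += code
        if c not in "hw":  # h and w do not separate equal codes
            previous = code
    return (result + "000")[:4]

print(soundex("Robert"), soundex("Rupert"))  # R163 R163
print(soundex("Tymczak"))                    # T522
```

Words with the same code, such as "Robert" and "Rupert", are treated as sounding alike.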
981       substring
982              Match  a substring anywhere in the headword.  This search strat‐
983              egy uses a modified Boyer-Moore-Horspool  algorithm.   Since  it
984              must search the whole index file, it is not as fast as the exact
985              and prefix matches.
986
987       suffix Suffix match.  This search strategy also uses a modified  Boyer-
988              Moore-Horspool  algorithm,  and  is  as  fast  as  the substring
989              search.  If the optional index_suffix string file is  listed  in
990              the configuration file this search is much faster.
991
992       word   Match  any  single word, even if part of a multi-word entry.  If
993              the optional index_word string file is listed in the  configura‐
994              tion file this search strategy works much faster.
995
996       first  Match the first word that begins a multi-word entry.
997
998       last   Match  the  last  word  that  ends  a  multi-word entry.  If the
999              optional index_suffix string file is listed in the configuration
1000              file this search strategy works much faster.
1001

DATABASE FORMAT

1003       Databases for dictd are distributed separately.  A database consists of
1004       two files.  One is a flat text file, the other is the index.
1005
1006       The flat text file contains dictionary entries (or any  other  suitable
1007       data),  and  the  index contains tab-delimited tuples consisting of the
1008       headword, the byte offset at which this entry begins in the  flat  text
1009       file,  and the length of the entry in bytes.  The offset and length are
1010       encoded using base 64 encoding using the 64-character subset of  Inter‐
1011       national  Alphabet  IA5  discussed in RFC 1421 (printable encoding) and
1012       RFC 1522 (base64 MIME).  Encoding the offsets in base 64 saves  consid‐
1013       erable space when compared with the usual base 10 encoding, while still
1014       permitting tab characters (ASCII 9) to be used for delimiting fields in
1015       a  record.   Each  record  ends with a newline (ASCII 10), so the index
1016       file is human readable.
1017
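One index record can be decoded with a short Python sketch. It assumes that offset and length are written as base-64 numbers, most significant digit first, over the alphabet A-Z, a-z, 0-9, +, /. The sample record is made up for illustration.

```python
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)

def b64_number(text):
    """Decode a base-64 number (most significant digit first)."""
    value = 0
    for ch in text:
        value = value * 64 + ALPHABET.index(ch)
    return value

def parse_index_line(line):
    """Split a tab-delimited index record into (headword, offset, length)."""
    headword, offset, length = line.rstrip("\n").split("\t")[:3]
    return headword, b64_number(offset), b64_number(length)

# Hypothetical record: headword, offset "BA" (64), length "Dk" (228).
print(parse_index_line("apple\tBA\tDk\n"))  # ('apple', 64, 228)
```

With the decoded values, the entry is the byte range [offset, offset + length) of the uncompressed flat text file.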
1018       Some headwords have a special meaning for dictd:
1019
1020       00-database-info Contains the information about the database that is
1021       returned by the SHOW INFO command, unless it is specified in the
1022       configuration file.
1023
1024       00-database-short Contains the short name of the database that is
1025       returned by the SHOW DB command, unless it is specified in the
1026       configuration file.  See dictfmt -s.
1027
1028       00-database-url URL where the original dictionary sources were obtained
1029       from.  See dictfmt -u.  This headword is not used by dictd.
1030
1031       00-database-utf8 Present if the dictionary is encoded using UTF-8.  See
1032       dictfmt --utf8.
1033
1034       00-database-8bit-new Present if the dictionary is encoded using an 8-bit
1035       character set (not ASCII and not UTF-8).  See dictfmt --locale.
1036
1037       The flat text file may be compressed using gzip(1) (not recommended) or
1038       dictzip(1) (highly recommended).  Optimal speed will be obtained  using
1039       an  uncompressed  file.   However, the gzip compression algorithm works
1040       very well on plain text, and can  result  in  space  savings  typically
1041       between 60 and 80%.  Using a file compressed with gzip(1) is not recom‐
1042       mended, however, because random access on the file can only  be  accom‐
1043       plished  by  serially  decompressing the whole file, a process which is
1044       prohibitively slow.  dictzip(1) uses the same compression algorithm and
1045       file  format  as does gzip(1), but provides a table that can be used to
1046       randomly access compressed blocks in the  file.   The  use  of  50-64kB
1047       blocks for compression typically degrades compression by less than 10%,
1048       while maintaining acceptable random access capabilities for all data in
1049       the file.  As an added benefit, files compressed with dictzip(1) can be
1050       decompressed with gzip(1) or zcat(1).  (Note: recompressing a dictzip'd
1051       file using, for example, znew(1) will destroy the random access charac‐
1052       teristics of the file.  Always compress data files using dictzip(1).)
1053
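The benefit of a chunk table can be seen from a small Python calculation: for a given entry, only the chunks that overlap its byte range need to be decompressed. The function name and the chunk size of 60000 bytes are assumptions chosen within the 50-64kB range mentioned above.

```python
def chunks_needed(offset, length, chunk_size):
    """Return the indexes of the chunks that cover [offset, offset + length)."""
    if length <= 0:
        return range(0)
    first = offset // chunk_size
    last = (offset + length - 1) // chunk_size
    return range(first, last + 1)

print(list(chunks_needed(120000, 500, 60000)))  # [2]
print(list(chunks_needed(59900, 200, 60000)))   # [0, 1]
```

With plain gzip(1) there is no such table, so reaching the same entry means decompressing everything that precedes it.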

SIGNALS

1055       SIGHUP causes dictd to reread the configuration file and reinitialize
1056       the databases.
1057
1058       SIGUSR1 causes dictd to unload databases.  dictd then returns status
1059       420 (instead of 220).  To load the databases again, send SIGHUP.
1060       Because database files are mapped with mmap(2), they cannot be updated
1061       while dictd is running.  So, if you need to update database files and
1062       reread the configuration file, first send SIGUSR1 to dictd to unload
1063       the databases, update the files, and then send SIGHUP to load them again.
1064
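The update sequence can be scripted. The following Python sketch is only an outline under stated assumptions: the function names are hypothetical, the pid file path is the default listed under FILES, and the caller supplies the step that actually replaces the database files. It does not wait for dictd to finish unloading, which a real script may need to do.

```python
import os
import signal

def read_pid(path="/var/run/dictd.pid"):
    """Read the process id of the main dictd process from its pid file."""
    with open(path) as handle:
        return int(handle.read().strip())

def update_databases(pid, update, send=os.kill):
    """Unload databases, run update(), then ask dictd to reload."""
    send(pid, signal.SIGUSR1)  # dictd unloads its databases
    update()                   # replace the database files here
    send(pid, signal.SIGHUP)   # dictd rereads its configuration and reloads

# Dry run with a stand-in for os.kill, so that no real signal is sent.
calls = []
update_databases(1234,
                 update=lambda: calls.append("update"),
                 send=lambda pid, sig: calls.append((pid, sig.name)))
print(calls)  # [(1234, 'SIGUSR1'), 'update', (1234, 'SIGHUP')]
```

In real use the call would be update_databases(read_pid(), copy_new_files), where copy_new_files is whatever routine installs the new files.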

COPYING

1066       The main source files for the dictd server and the dictzip  compression
1067       program  were written by Rik Faith (faith@dict.org) and are distributed
1068       under the terms of the GNU General Public License.  If you need to dis‐
1069       tribute under other terms, write to the author.
1070
1071       The  main  libraries  used  by these programs (zlib, regex, libmaa) are
1072       distributed under different terms, so  you  may  be  able  to  use  the
1073       libraries  for  applications  which  are  incompatible  with the GPL --
1074       please see the copyright notices and license information that come with
1075       the  libraries  for more information, and consult with your attorney to
1076       resolve these issues.
1077

BUGS

1079       The regular expression searches  do  not  ignore  non-whitespace,  non-
1080       alphanumeric  characters  as  do the other searches.  In practice, this
1081       isn't much of a problem.
1082

WARNINGS

1084       Conformance of regular expressions (used by the 're' and 'regexp'
1085       search strategies) to ERE and BRE depends on the library you build
1086       dictd with.  Whether the 're' and 'regexp' strategies support UTF-8
1087       also depends on that library.
1088

FILES

1090       /etc/dictd.conf
1091              dictd configuration file
1092
1093       /usr/sbin/dictd
1094              dictd daemon itself
1095
1096       /var/run/dictd.pid
1097              File for storing pid of dictd daemon
1098
1099       /usr/share
1100              The default directory for dictd databases (.index and .dict[.dz]
1101              files)
1102

SEE ALSO

1104       examples/dictd*.conf,  dictfmt(1),  dict(1),   dictzip(1),   gunzip(1),
1105       zcat(1), webster(1), RFC 2229
1106
1107
1108
1109                                 29 March 2002                        DICTD(8)