DICTD(8) DICTD(8)

6 dictd - a dictionary database server
7
9 dictd [options]
10
12 dictd is a server for the Dictionary Server Protocol (DICT), a TCP
13 transaction based query/response protocol that allows a client to
14 access dictionary definitions from a set of natural language dictionary
15 databases.
16
17 For security reasons, dictd drops root permissions after startup. If
18 user dictd exists on the system, the daemon will run as that user,
19 group dictd, otherwise it will run as user nobody, group nobody or
20 nogroup (depending on the operating system distribution).
21
22 Since startup time is significant, the server is designed to run con‐
23 tinuously, and should not be run from inetd(8). (However, with a fast
24 processor, it is feasible to do so.)
25
26 Databases are distributed separately from the server.
27
28 By default, dictd assumes that the index files are sorted alphabeti‐
29 cally, and only alphanumeric characters from the 7-bit ASCII character
30 set are used for search. This default may be overridden by a header in
31 the data file. The only such features implemented at this time are the
32 headers "00-database-allchars" which tells dictd that non-alphanumeric
33 characters may also be used for search, the header "00-database-utf8"
34 which indicates that the database uses utf8 encoding, and the "00-data‐
35 base-8bit-new" which indicates that the database is encoded and sorted
36 according to a locale that uses an 8-bit encoding.
37
39 For many years, the Internet community has relied on the "webster" pro‐
40 tocol for access to natural language definitions. The webster protocol
41 supports access to a single dictionary and (optionally) to a single
42 thesaurus. In recent years, the number of publicly available webster
43 servers on the Internet has dramatically decreased.
44
45 Fortunately, several freely-distributable dictionaries and lexicons
46 have recently become available on the Internet. However, these freely-
47 distributable databases are not accessible via a uniform interface, and
48 are not accessible from a single site. They are often small and incom‐
49 plete individually, but would collectively provide an interesting and
50 useful database of English words. Examples include the Jargon file,
51 the WordNet database, MICRA's version of the 1913 Webster's Revised
52 Unabridged Dictionary, and the Free Online Dictionary of Computing.
53 (See the DICT protocol specification (RFC) for references.) Translat‐
54 ing and non-English dictionaries are also becoming available (for exam‐
55 ple, the FOLDOC dictionary is being translated into Spanish).
56
57 The webster protocol is not suitable for providing access to a large
58 number of separate dictionary databases, and extensions to the current
59 webster protocol were not felt to be a clean solution to the dictionary
60 database problem.
61
62 The DICT protocol is designed to provide access to multiple databases.
63 Word definitions can be requested, the word index can be searched
64 (using an easily extended set of algorithms), information about the
65 server can be provided (e.g., which index search strategies are sup‐
66 ported, or which databases are available), and information about a
67 database can be provided (e.g., copyright, citation, or distribution
68 information). Further, the DICT protocol has hooks that can be used to
69 restrict access to some or all of the databases.
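
As a rough illustration of the query/response exchange described above, the following Python sketch formats a DEFINE command and splits a status line into its parts. The helper names (build_define, parse_status, define) are hypothetical and are not part of dictd. The sketch assumes the command syntax and three-digit status codes of RFC 2229, the default port 2628, and a server that accepts pipelined commands and closes the connection after QUIT.

```python
import socket


def build_define(database, word):
    """Format a DEFINE command line; the word is quoted so spaces survive."""
    escaped = word.replace("\\", "\\\\").replace('"', '\\"')
    return f'DEFINE {database} "{escaped}"\r\n'.encode("utf-8")


def parse_status(line):
    """Split a status line into (three-digit code, remaining text)."""
    code, _, text = line.strip().partition(" ")
    return int(code), text


def define(host, word, database="*", port=2628):
    """Send DEFINE followed by QUIT and return every response line.

    Hypothetical helper: it relies on the server closing the connection
    after QUIT, so it simply reads until end of stream.
    """
    with socket.create_connection((host, port), timeout=10) as conn:
        conn.sendall(build_define(database, word) + b"QUIT\r\n")
        with conn.makefile("r", encoding="utf-8") as reader:
            return [line.rstrip("\r\n") for line in reader]
```

With a dictd instance running locally and a database configured under the name wn, a call such as define("localhost", "example", "wn") would return the banner, the definition text, and the closing status lines.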
70
71 dictd(8) is a server that implements the DICT protocol. Bret Martin
72 implemented another server, and several people (including Bret and
73 myself) have implemented clients in a variety of languages.
74
76 -V or --version
77 Display version information.
78
79 --license
80 Display copyright and license information.
81
82 -h or --help
83 Display help information.
84
85 -v or --verbose or -dverbose
86 Be verbose.
87
88 -c file or --config file
89 Specify configuration file. The default is /etc/dictd.conf ,
90 but may be changed in the defs.h file at compile time
91 (DICTD_CONFIG_FILE).
92
93 -p port or --port port
94 Overrides the keyword port in Global Settings Specification sec‐
95 tion of configuration file.
96
97 -i or --inetd
98 Communicate on standard input/output, suitable for use from
99 inetd. Although, due to its rather large startup time, this
100 daemon was not intended to run from inetd, with a fast processor
101 it is feasible to do so. This option also implies --fast-start.
102
103 --pp prog
Sets a preprocessor for the configuration file, such as m4 or cpp.
See the examples/dictd_complex.conf file from the distribution. By
default, the configuration file is parsed without a preprocessor.
107
108 --depth length
109 Overrides the keyword depth in Global Settings Specification
110 section of configuration file.
111
112 --delay seconds
113 Overrides the keyword delay in Global Settings Specification
114 section of configuration file.
115
116 --facility facility
117 The same as syslog_facility keyword in Global Settings Specifi‐
118 cation of configuration files.
119
120 -f or --force
121 Force the daemon to start even if an instance of the daemon is
122 already running. (This is of little value unless a non-default
123 port is specified with -p, since, if one instance is bound to a
124 port, the second one fails when it can not bind to the port.)
125
126 --limit children
127 Overrides the keyword limit in Global Settings Specification
128 section of configuration file.
129
130 --listen-to address
131 Overrides the keyword listen_to in Global Settings Specification
132 section of configuration file.
133
134 --locale locale
135 Overrides the keyword locale in Global Settings Specification
136 section of configuration file.
137
138 -s The same as syslog keyword in Global Settings Specification of
139 configuration files.
140
141 -L file or --logfile file
142 The same as log_file keyword in Global Settings Specification of
143 configuration files.
144
145 --pid-file file
146 The same as pid_file keyword in Global Settings Specification of
147 configuration files.
148
149 -m minutes or --mark minutes
150 Overrides the keyword timestamp in Global Settings Specification
151 section of configuration file.
152
153 --default-strategy strategy
154 Overrides the keyword default_strategy in Global Settings Speci‐
155 fication section of configuration file.
156
157 --without-strategy strat1,strat2,...
158 The same as without_strategy keyword in Global Settings Specifi‐
159 cation of configuration files.
160
161 --add-strategy strategy_name:description
162 The same as add_strategy keyword in Global Settings Specifica‐
163 tion of configuration files.
164
165 --fast-start
166 The same as fast_start keyword in Global Settings Specification
167 of configuration files.
168
169 --without-mmap
170 The same as without_mmap keyword in Global Settings Specifica‐
171 tion of configuration files.
172
173 --stdin2stdout
174 When applied with --inetd, each command obtained from stdin is
175 output to stdout. This option is useful for debugging.
176
177 -l option or --log option
178 The same as log_option keyword in Global Settings Specification
179 of configuration files.
180
181 -d option
182 The same as debug_option keyword in Global Settings Specifica‐
183 tion of configuration files.
184
186 Introduction
187 The configuration file defaults to /etc/dictd.conf but can be
188 specified on the command line with the -c option (see above).
189
190 The configuration file is read into memory at startup, and is
191 not referenced again by dictd unless a signal 1 (SIGHUP) is
192 received, which will cause dictd to reread the configuration
193 file.
194
195 The file is divided into sections. The Access Section should
196 come first, followed by the Database Section, and the User Sec‐
197 tion. The Database Section is required; the others are
198 optional, but they must be in the order listed here.
199
200 Syntax The following keywords are valid in a configuration file:
201 access, allow, deny, group, database, data, index, filter, pre‐
202 filter, postfilter, name, include, user, authonly, site. Key‐
203 words are case sensitive. String arguments that contain spaces
204 should be surrounded by double quotes. Without quoting, strings
205 may contain alphanumeric characters and _, -, ., and *, but not
206 spaces. Strings can be continued between lines. \", \\, \n,
207 \<NL> are treated as double quote, backslash, new line and no
208 symbol respectively. Comments start with # and extend to the
209 end of the line.
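
The escape rules above can be sketched in Python as follows. This is only an illustration of the four documented sequences, not the scanner dictd actually uses; the function name unescape is hypothetical, and passing any other escaped character through literally is an assumption.

```python
def unescape(quoted):
    """Decode the body of a double-quoted string (outer quotes already removed)."""
    out = []
    i = 0
    while i < len(quoted):
        ch = quoted[i]
        if ch == "\\" and i + 1 < len(quoted):
            nxt = quoted[i + 1]
            if nxt == "n":
                out.append("\n")      # \n becomes a new line
            elif nxt == "\n":
                pass                  # backslash-newline contributes nothing
            else:
                out.append(nxt)       # \" and \\ (other characters: assumption)
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```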
210
211 Global Settings Section
212
213 global { global settings specification }
Used to set global dictd settings such as the log file, syslog
facility, locale and so on.
216
217 EXAMPLE:
218 See examples/dictd4.conf file from the distribution.
219
220 Access Section
221
222 access { access specification }
223 This section contains access restrictions for the server
224 and all of the databases collectively. Per-database con‐
225 trol is specified in the Database Section.
226
227 EXAMPLE:
228 See examples/dictd3.conf file from the distribution.
229
230 Database Section
231
232 database string { database specification }
233 The string specifies the name of the database (e.g., wn
234 or web1913). (This is an arbitrary name selected by the
235 administrator, and is not necessarily related to the file
236 name or any name listed in the data file. A short, easy
237 to type name is often selected for easy use with dict
238 -d.)
239
240 EXAMPLE: See examples/dictd*.conf files from the distri‐
241 bution.
242
243 NOTE: If the files specified in the database specifica‐
244 tion do not exist on the system, dictd may silently fail.
245
246 database_virtual string { virtual database specification }
247 This section specifies the virtual database. The string
248 specifies the name of the database (e.g., en-ru or fren).
249
250 EXAMPLE: See examples/dictd_virtual.conf or exam‐
251 ples/dictd_complex.conf files from the distribution.
252
253 database_plugin string { plugin specification }
254 This section specifies the plugin. The string specifies
255 the name of the database.
256
257 EXAMPLE: See examples/dictd_plugin_dbi.conf or exam‐
258 ples/dictd_complex.conf files from the distribution.
259
260 database_mime string { mime specification }
Traditionally, databases created for dictd contained plain text
only, because dictd releases before 1.10.0 did not fully support
the OPTION MIME command (see RFC 2229). This section describes a
special database that behaves differently depending on whether
the OPTION MIME command was received from the client. That is,
the database created by this section returns either plain text
or specially formatted content, depending on whether the DICT
client supports (or wants to receive) MIME content. The string
specifies the name of the database.

NOTE: This applies to the DEFINE command only. The MATCH, SHOW
DB, SHOW STRAT, SHOW INFO, SHOW SERVER and HELP commands return
text preceded by an empty line only.
276
277 EXAMPLE: See examples/dictd_mime.conf file from the dis‐
278 tribution.
279
280 database_exit
Excludes the databases that follow from the '*' database. By
default, '*' means all available databases. See the
examples/dictd_virtual.conf file for an example configuration.
285
286 NOTE: If you use 'virtual' dictionaries, you should use
287 this directive, otherwise you will search the same dic‐
288 tionary twice.
289
290 User Section
291
292 user string string
293 The first string specifies the username, and the
294 second string specifies the shared secret for this
295 username. When the AUTH command is used, the
296 client will provide the username and a hashed ver‐
297 sion of the shared secret. If the shared secret
298 matches, the user is said to have authenticated,
299 and will have access to databases whose access
300 specifications allow that user (by name, or by
301 wildcard). If present, this section must appear
302 last in the configuration file. There may be many
303 user entries. The shared secret should be kept
304 secret, as anyone who has access to it can access
305 the shared databases (assuming access is not
306 denied by domain name).
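
The hashed form of the shared secret can be sketched as follows. This assumes the APOP-style scheme that RFC 2229 describes for AUTH, in which the client sends the lowercase hexadecimal MD5 digest of the msg-id from the server banner (angle brackets included) concatenated with the shared secret. The function names and the sample msg-id are hypothetical.

```python
import hashlib


def auth_string(msg_id, shared_secret):
    """MD5 of msg-id followed by the shared secret, as lowercase hex."""
    return hashlib.md5((msg_id + shared_secret).encode("utf-8")).hexdigest()


def auth_command(username, msg_id, shared_secret):
    """Build the AUTH command line a client would send (sketch only)."""
    return f"AUTH {username} {auth_string(msg_id, shared_secret)}\r\n"


# Hypothetical msg-id, as it might appear at the end of a 220 banner.
example = auth_command("alice", "<100.25.1234@dict.example.org>", "secret")
```

Because the msg-id changes with every connection, the digest cannot be replayed later, but anyone who can read the shared secret in the configuration file can still compute it.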
307
308 Access Specification
309 Access specifications may occur in the Access Section or
310 in the Database Section. The access specification will
311 be described here.
312
313 For allow, deny, and authonly, a star (*) may be used as
314 a wild card that matches any number of characters. A
315 question mark (?) may be used as a wildcard that matches
316 a single character. For example, 10.0.0.* and *.edu are
317 valid strings.
318
319 Further, a range of IP addresses and an IP address fol‐
320 lowed by a netmask may be specified. For example,
321 10.0.0.0:10.0.0.255, 10.0.0.0/24, and 10.0.0.* all spec‐
322 ify the same range of IP numbers. Notation cannot be
323 combined on the same line. If the notation does not make
324 sense, access will be denied by default. Use the --debug
325 auth option to debug related problems.
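
The equivalence of the three notations in the example above can be checked with a short Python sketch. It uses the standard ipaddress and fnmatch modules as stand-ins; dictd's own matching code is not shown here, and fnmatch accepts some patterns (such as bracket expressions) that dictd may not.

```python
import fnmatch
import ipaddress

first = ipaddress.ip_address("10.0.0.0")
last = ipaddress.ip_address("10.0.0.255")

# Range notation 10.0.0.0:10.0.0.255 collapses to a single network ...
from_range = list(ipaddress.summarize_address_range(first, last))
# ... which is the same network as the netmask notation 10.0.0.0/24.
from_mask = ipaddress.ip_network("10.0.0.0/24")

# Wildcard notation: every address in that network matches 10.0.0.*
all_match = all(fnmatch.fnmatchcase(str(addr), "10.0.0.*") for addr in from_mask)
# An address just outside the range does not match the wildcard.
outside = fnmatch.fnmatchcase("10.0.1.0", "10.0.0.*")

print(from_range == [from_mask], all_match, outside)  # prints: True True False
```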
326
327 Note that these specifications take only one string per
328 specification line. However, you can have multiple lines
329 of each type.
330
331 The syntax is as follows:
332
333 allow string
334 The string specifies a domain name or IP address
335 which is allowed access to the server (in the
336 Access Section) or to a database (in the Database
Section). Note that more than one string is not
permitted for a single "allow" line, but more than
one "allow" line is permitted in the configuration
file.
341
342 deny string
343 The string specifies a domain name or IP address
344 which is denied access to the server (in the
345 Access Section) or to a database (in the Database
346 Section). Note that if reverse DNS is not work‐
347 ing, then only the IP number will be checked.
348 Therefore, it is essential to deny networks based
349 on IP number, since a denial based on domain name
350 may not always be checked.
351
352 authonly string
353 This form is only useful in the Access Section.
354 The string specifies a domain name or IP address
355 which is allowed access to the server but not to
356 any of the databases. All commands are valid
357 except DEFINE, MATCH, and SHOW DB. More specifi‐
358 cally AUTH is a valid command, and commands which
359 access the databases are not allowed.
360
361 user string
362 This form is only useful in the Database Section.
363 The string specifies a username that is allowed to
364 access this database after a successful AUTH com‐
365 mand is executed.
366
367 Global Settings Specification
368 This section describes the following parameters:
369
370 port string_or_number
371 Specifies the port or service name (e.g., 2628). The
372 default is 2628, as specified in the DICT Protocol RFC,
373 but may be changed in the defs.h file at compile time
374 (DICT_DEFAULT_SERVICE).
375
376 site string
377 Used to specify the filename for the site information
378 file, a flat text file which will be displayed in
379 response to the SHOW SERVER command.
380
381 EXAMPLE: See examples/dictd4.conf file from the distribu‐
382 tion.
383
384 site_no_banner boolean
By default, the SHOW SERVER command outputs information about
the dictd version and the operating system type. This option
disables this.
388
389 site_no_uptime boolean
By default, the SHOW SERVER command outputs the uptime of
dictd, the number of forks since startup, and the number of
forks per hour. This option disables this.
393
394 site_no_dblist boolean
By default, the SHOW SERVER command outputs internal
information about databases, such as the number of headwords,
index size and so on. This option disables this.
398
399 delay number
400 Specifies the number of seconds a client may be idle
401 before the server will close the connection. Idle time
402 is defined to be the time the server is waiting for input
403 and does not include the time the server spends searching
404 the database. The default is 0 seconds (no limit), but
405 may be changed in the defs.h file at compile time
406 (DICT_DEFAULT_DELAY).
407
NOTE: Setting the delay option disables the limit_time option.
Only one of them (the one specified last in dictd.conf) is in
effect.
411
412 NOTE: Connections are closed without warning since no
413 provision for premature connection termination is speci‐
414 fied in the DICT protocol RFC.
415
416 depth number
417 Specify the queue length for listen(2). Specifies the
418 number of pending socket connections which are queued by
419 the operating system. Some operating systems may
420 silently limit this value to 5 (older BSD systems) or 128
421 (Linux). The default is 10 but may be changed in the
422 defs.h file at compile time (DICT_QUEUE_DEPTH).
423
424 limit_childs number
425 Specifies the number of daemons that may be running
426 simultaneously. Each daemon services a single connec‐
427 tion. If the limit is exceeded, a (serialized) connec‐
428 tion will be made by the server process, and a response
429 code 420 (server temporarily unavailable) will be sent to
430 the client. This parameter should be adjusted to prevent
431 the server machine from being overloaded by dict clients,
432 but should not be set so low that many clients are denied
433 useful connections. The default is 100, but may be
434 changed in the defs.h file at compile time (DICT_DAE‐
435 MON_LIMIT_CHILDS).
436
437 limit number
438 Synonym for limit_childs. For backward compatibility
439 only.
440
441 limit_matches number
442 Specifies the maximum number of matches that can be
443 returned by MATCH query. Zero means no limit. The default
444 is 2000.
445
446 limit_definitions number
447 Specifies the maximum number of definitions that can be
448 returned by DEFINE query. Zero means no limit. The
449 default is 200.
450
451 limit_time number
452 Specifies the number of seconds a client may talk to the
453 server before the server will close the connection. The
454 default is 600 seconds (10 minutes), but may be changed
455 in the defs.h file at compile time
456 (DICT_DEFAULT_LIMIT_TIME).
457
NOTE: Setting the limit_time option disables the delay option.
Only one of them (the one specified last in dictd.conf) is in
effect.
461
462 NOTE: Connections are closed without warning since no
463 provision for premature connection termination is speci‐
464 fied in the DICT protocol RFC.
465
466 limit_queries number
467 Specifies the number of queries (MATCH, DEFINE, SHOW DB
468 etc.) that client may send to the server before the
469 server will close the connection. Zero means no limit.
470 The default is 2000, but may be changed in the defs.h
471 file at compile time (DICT_DEFAULT_LIMIT_QUERIES).
472
473 timestamp number
How often a timestamp should be logged (in minutes).
475 (This is effective only if logging has been enabled with
476 the -s or -L option, or with a debugging option.)
477
478 log_option option
Specify a logging option. This is effective only if logging
has been enabled with the -s or -L option or in the
configuration file, or logging to the console has been
activated with a debugging option (e.g., --debug nodetach).
Only one option may be set with each invocation of this
option; however, multiple invocations of this option may
be made in the configuration file or on the dictd command
line. For instance:
487 dictd -s --log stats --log found --log notfound
488 is a valid command line, and sets three logging options.
489
490 Some of the more verbose logging options are used primar‐
491 ily for debugging the server code, and are not practical
492 for normal use.
493
494 server Log server diagnostics. This is extremely ver‐
495 bose.
496
497 connect
498 Log all connections.
499
500 stats Log all children terminations.
501
502 command
503 Log all commands. This is extremely verbose.
504
505 client Log results of CLIENT command.
506
507 found Log all words found in the databases.
508
509 notfound
510 Log all words not found in the databases.
511
512 timestamp
513 When logging to a file, use a full timestamp like
514 that which syslog would produce. Otherwise, no
515 timestamp is made, making the files shorter.
516
517 host Log name of foreign host.
518
519 auth Log authentication failures.
520
521 min Set a minimal number of options. If logging is
522 activated (to a file, or via syslog), and no
523 options are set, then the minimal set of options
524 will be used. If options are set, then only those
525 options specified will be used.
526
527 all Set all of the options.
528
529 none Clear all of the options.
530
531 To facilitate location of interesting information in the
532 log file, entries are marked with initial letters indi‐
533 cating the class of the line being logged:
534
535 I Information about the server, connections, or ter‐
536 mination statistics. These lines are generally
537 not designed to be parsed automatically.
538
539 E Error messages.
540
541 C CLIENT command information.
542
543 D Definitions found in the databases searched.
544
545 M Matches found in the database searched.
546
547 N Matches which were not found in the databases
548 searched.
549
550 T Trace of exact line sent by client.
551
552 A Authentication information.
553
554 To preserve anonymity of the client, do not use the con‐
555 nect or host options. Clients may or may not send host
556 information using the CLIENT command, but this should be
557 an option that is selectable on the client side.
558
559 debug_option string
560 Activate a debugging option. There are several, all of
561 which are only useful to developers. They are documented
562 here for completeness. A list can be obtained interac‐
563 tively by using -d with an illegal option.
564
565 verbose
566 The same as -v or --verbose. Adds verbosity to
567 other options.
568
569 scan Debug the scanner for the configuration file.
570
571 parse Debug the parser for the configuration file.
572
573 search Debug the character folding and binary search rou‐
574 tines.
575
576 init Report database initialization.
577
578 port Log client-side port number to the log file.
579
580 lev Debug Levenshtein search algorithm.
581
582 auth Debug the authorization routines.
583
584 nodetach
585 Do not detach as a background process. Implies
586 that a copy of the log file will appear on the
587 standard output.
588
589 nofork Do not fork daemons to service requests. Be a
590 single-threaded server. This option implies node‐
591 tach, and is most useful for using a debugger to
592 find the point at which daemon processes are dump‐
593 ing core.
594
595 alt Debugs altcompare in index.c.
596
597 locale string
598 Specifies the locale used for searching. If no locale is
599 specified, the "C" locale is used. The locale used for
600 the server should be the same as that used for dictfmt
601 when the database was built (specifically, the locale
602 under which the index was sorted). The locale should be
specified for both 8-bit and UTF-8 formats. If the locale
contains the substring utf8 or utf-8, UTF-8 format is
expected. Note that if your database is not in 7-bit ASCII or
UTF-8 format, then the dictd server will not be compliant
with RFC 2229.
608
NOTE: If utf-8 or 8-bit dictionaries are included in the
610 configuration file, and the appropriate --locale has not
611 been specified, dictd will fail to start. This implies
612 that dictd will not run with both utf-8 and 8-bit dictio‐
613 naries in the configuration file.
614
615 add_strategy strategy_name description
Adds the strategy strategy_name with the description description.
This new search strategy may be implemented with the help of
plugins. Both strategy_name and description are strings.
620
621 default_strategy string
622 Set the server's default search strategy for MATCH search
623 type. The compiled-in default is 'lev'. It is also pos‐
624 sible to set default strategy per database. See
625 default_strategy keyword in Database specification sec‐
626 tion.
627
628 disable_strategy string
629 Disable specified strategies. By default all implemented
630 search strategies are enabled. It is also possible to
631 disable strategies per database. See disable_strategy
632 keyword in Database specification section.
633
634 listen_to string
635 Binds socket to the specified address. If you want to
636 allow connections to dict server from localhost only,
637 apply
638 listen_to 127.0.0.1
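
The combined effect of listen_to and depth can be pictured with a small Python sketch of a listening socket. This is not dictd's code: the function name open_listener is hypothetical, and port 0 is used only so that the operating system picks a free port for the demonstration.

```python
import socket


def open_listener(address="127.0.0.1", port=0, depth=10):
    """Bind to one address only and set the listen(2) queue length."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((address, port))   # corresponds to listen_to
    srv.listen(depth)           # corresponds to depth
    return srv


srv = open_listener()
bound_host, bound_port = srv.getsockname()
srv.close()
```

A socket bound this way accepts connections arriving on the loopback interface only, which is why the listen_to 127.0.0.1 setting restricts the server to local clients.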
639
640 syslog string
641 Log using the syslog(3) facility.
642
643 syslog_facility string
644 Specifies the syslog facility to use. The use of this
645 option implies the -s option to turn on logging via sys‐
646 log. When the operating system libraries support SYS‐
647 LOG_NAMES, the names used for this option should be those
648 listed in syslog.conf(5). Otherwise, the following names
649 are used (assuming the particular facility is defined in
650 the header files): auth, authpriv, cron, daemon, ftp,
651 kern, lpr, mail, news, syslog, user, uucp, local0,
652 local1, local2, local3, local4, local5, local6, and
653 local7.
654
655 log_file string
656 Specify the file for logging. The filename specified is
657 recomputed on each use using the strftime(3) call. For
658 example, a filename ending in ".%Y%m%d" will write to log
659 files ending in the year, month, and date that the log
660 entry was written.
661 NOTE: If dictd does not have write permission for this
662 file, it will silently fail.
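
The effect of recomputing the log file name on each use can be sketched in Python, whose strftime accepts the same %Y, %m and %d conversions as strftime(3). The path below is a hypothetical log_file value.

```python
from datetime import datetime

template = "/var/log/dictd.log.%Y%m%d"  # hypothetical log_file value


def log_path(when):
    """Expand the template for the moment a log entry is written."""
    return when.strftime(template)


print(log_path(datetime(2024, 1, 5, 23, 59)))  # /var/log/dictd.log.20240105
print(log_path(datetime(2024, 1, 6, 0, 1)))    # /var/log/dictd.log.20240106
```

Entries written just before and just after midnight therefore land in different files, without any external log rotation.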
663
664 pid_file string
665 The specified filename will be created to contain the
666 process id of the main dictd process. The default is
667 /var/run/dictd.pid
668
669 fast_start
By default, dictd creates an additional in-memory index to
make searches faster. This option disables that behaviour
and makes startup faster.
673
674 without_mmap
Do not use the mmap(2) function; read entire files into
memory instead. Use this option only if you know exactly
what you are doing.
678
679 Database Specification
680 The database specification describes the database:
681
682 data string
Specifies the filename for the flat text database. If
the filename does not begin with '.' or '/', it is
prepended with $datadir/. The value of $datadir is a
compile-time option; it can be changed by editing the
Makefile or by running ./configure --datadir=...
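
The path rule for data (and for the index options that follow) can be sketched as below. The function name and the datadir value are hypothetical; the real directory is whatever was chosen at compile time.

```python
def resolve_path(filename, datadir="/usr/share/dictd"):
    """Prefix datadir unless the name starts with '.' or '/'."""
    if filename.startswith((".", "/")):
        return filename
    return datadir + "/" + filename


print(resolve_path("wn.dict.dz"))         # /usr/share/dictd/wn.dict.dz
print(resolve_path("/opt/db/wn.dict.dz")) # /opt/db/wn.dict.dz
print(resolve_path("./wn.dict.dz"))       # ./wn.dict.dz
```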
688
689 index string
Specifies the filename for the index file. The path is
resolved as described above for the data option.
692
693 index_suffix string
This is an optional index file that makes the 'suffix' search
strategy faster (binary search). It is generated by
'dictfmt_index2suffix'. Run "dictfmt_index2suffix --help"
for more information. The path is resolved as described
above for the data option.
699
700 index_word string
This is an optional index file that makes the 'word' search
strategy faster (binary search). It is generated by
'dictfmt_index2word'. Run "dictfmt_index2word --help" for
more information. The path is resolved as described above
for the data option.
706
707 prefilter string
708 Specifies the prefilter command. When a chunk of the
709 compressed database is read, it will be filtered with
710 this filter before being decompressed. This may be used
711 to provide some additional compression that knows about
712 the data and can provide better compression than the LZ77
713 algorithm used by zlib.
714
715 postfilter string
716 Specifies the postfilter command. When a chunk of the
717 compressed database is read, it will be filtered with
718 this filter before the offset and length for the entry
719 are used to access data. This is provided for symmetry
720 with the prefilter command, and may also be useful for
721 providing additional database compression.
722
723 filter string
724 Specifies the filter command. After the entry is
725 extracted from the database, it will be filtered with
726 this filter. This may be used to provide formatting for
727 the entry (e.g., for html).
728
729 name string
730 Specifies the short name of the database (e.g., "1913
731 Webster's"). If the string begins with @, then it speci‐
732 fies the headword to look up in the dictionary to find
733 the short name of the database. The default is
734 "@00-database-short", but this may be changed in the
735 defs.h file at compile time (DICT_SHORT_ENTRY_NAME).
736
737 info string
738 Specifies the information about database. If the string
739 begins with @, then it specifies the headword to look up
740 in the dictionary to find information. The default is
741 "@00-database-info", but this may be changed in the
742 defs.h file at compile time (DICT_INFO_ENTRY_NAME).
743
744 invisible
Makes the dictionary invisible to clients, i.e. this
dictionary will not be recognized or shown by the DEFINE,
MATCH, SHOW INFO, SHOW SERVER and SHOW DB commands. If
definitions or matches are found in an invisible dictionary,
the name of the visible virtual dictionary that contains it
is returned. Dictionaries '*' and '!' do not include
invisible ones. NOTE: Invisible dictionaries are completely
inaccessible (and invisible) to the client unless they are
included in a virtual or MIME dictionary (see the
database_virtual or database_mime database sections).
755
756 disable_strategy string
Disables the specified strategy for the database. This may
be useful for slow dictionaries (plugins) or for dictionaries
included in virtual ones. For an example, see the file
examples/dictd_complex.conf.
761
762 default_strategy string
Specifies the strategy to be used when the database is
accessed with the strategy '.'. That is, this directive sets
the preferred search strategy per database. For example,
instead of the strategy lev, the strategy word may be
preferred for databases that mainly contain multiword phrases
rather than single words.
769
770 Virtual Database Specification
771 The virtual database specification describes the virtual data‐
772 base:
773
774 database_list string
Specifies a list of databases which are included in the
virtual database. Database names are given in the string
and are separated by commas.
778
779 name string
Specifies the short name of the database. See Database
Specification for more information.

info string
Specifies the information about the database. See Database
Specification for more information.

invisible
Makes the dictionary invisible to clients. See Database
Specification for more information.

disable_strategy string
Disables the specified strategy for the database. See
Database Specification for more information.
794
795 Plugin Specification
796
797 plugin string
798 Specifies a filename of the plugin.
799
800 data string
801 Specifies data for initializing plugin.
802
803 name string
804 Specifies the short name of the database. See Database
805 Specification for more information.
806
807 info string
808 Specifies the information about database. See Database
809 Specification for more information.
810
811 invisible
812 Makes dictionary invisible to the clients. See Database
813 Specification for more information.
814
815 disable_strategy string
816 Disables the specified strategy for database. See Data‐
817 base Specification for more information.
818
819 default_strategy string
820 Sets the default search strategy for database. See Data‐
821 base Specification for more information.
822
823 Mime Specification
824
825 dbname_nomime string
826 Specifies the real database name that is used when the
827 OPTION MIME command was NOT received from a client.
828
829 dbname_mime string
830 Specifies the real database name that is used when the
831 OPTION MIME command WAS received from a client. The
832 necessary MIME header is set when the database is created.
833 See the --mime-header option of dictfmt(1).
834
835 name string
836 Specifies the short name of the database. See Database
837 Specification for more information.
838
839 info string
840 Specifies information about the database. See Database
841 Specification for more information.
842
843 invisible
844 Makes the dictionary invisible to clients. See Database
845 Specification for more information.
846
847 disable_strategy string
848 Disables the specified strategy for the database. See
849 Database Specification for more information.
850
851 default_strategy string
852 Sets the default search strategy for the database. See
853 Database Specification for more information.
854
855 include string
856 The text of the file "string" (usually a database specification)
857 will be read as if it appeared at this location in the configu‐
858 ration file. Nested includes are not permitted.
859
861 When a client connects, the global access specification is scanned, in
862 order, until a specification matches. If no access specification
863 exists, all access is allowed (i.e., the action is the same as if
864 "allow *" was the only item in the specification). For each item, both
865 the hostname and IP are checked. For example, consider the following
866 access specification:
867 allow 10.42.*
868 authonly *.edu
869 deny *
870 With this specification, all clients in the 10.42 network will be
871 allowed access to unrestricted databases; all clients from *.edu sites
872 will be allowed to authenticate, but will be denied access to all data‐
873 bases, even those which are otherwise unrestricted; and all other
874 clients will have their connection terminated immediately. The 10.42
875 network clients can send an AUTH command and gain access to restricted
876 databases. The *.edu clients must send an AUTH command to gain access
877 to any databases, restricted or unrestricted.
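
The first-match scan described above can be sketched in Python. This is only an illustration under stated assumptions: the names access_level and SPEC are hypothetical, shell-style globbing via fnmatch stands in for whatever pattern matching dictd actually performs, and the result when no item matches is left open because the text does not describe it.

```python
from fnmatch import fnmatchcase

def access_level(spec, hostname, ip):
    """Return the action of the first item whose pattern matches.

    spec is an ordered list of (action, pattern) pairs. Both the hostname
    and the IP address are checked against each pattern, as described above.
    """
    if not spec:
        return "allow"  # no specification: same as a lone "allow *"
    for action, pattern in spec:
        if fnmatchcase(ip, pattern):
            return action
        # The hostname may be missing if reverse DNS is not working.
        if hostname and fnmatchcase(hostname.lower(), pattern.lower()):
            return action
    return None  # no item matched (behaviour not covered by this sketch)

# The example specification from the text.
SPEC = [("allow", "10.42.*"), ("authonly", "*.edu"), ("deny", "*")]
```

Because the scan stops at the first match, the order of the items matters: moving "deny *" to the top of SPEC would reject every client.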
878
879 When the AUTH command is sent, the access list for each database is
880 scanned, in order, just as the global access list is scanned. However,
881 after authentication, the client has an associated username. For exam‐
882 ple, consider the following access specification:
883 user u1
884 deny *.com
885 user u2
886 allow *
887 If the client authenticated as u1, then the client will have access to
888 this database, even if the client comes from a *.com site. In con‐
889 trast, if the client authenticated as u2, the client will only have
890 access if it does not come from a *.com site. In this case, the "user
891 u2" is redundant, since that client would also match "allow *".
892
893 Warning: Checks are performed for domain names and for IP addresses.
894 However, if reverse DNS for a specific site is not working, it is pos‐
895 sible that a domain name may not be available for checking. Make sure
896 that all denials use IP addresses. (And consider a future enhancement:
897 if a domain name is not available, should denials that depend on a
898 domain name match anything? This is the more conservative viewpoint,
899 but it is not currently implemented.)
900
902 The DICT standard specifies a few search algorithms that must be imple‐
903 mented, and permits others to be supported on a server-dependent basis.
904 The following search strategies are supported by this server. Note
905 that all strategies are case insensitive. Most ignore non-alphanu‐
906 meric, non-whitespace characters.
907
908 exact An exact match. This algorithm uses a binary search and is one
909 of the fastest search algorithms available.
910
911 lev The Levenshtein algorithm (string edit distance of one). This
912 algorithm searches for all words which are within an edit dis‐
913 tance of one from the target word. An "edit" means an inser‐
914 tion, deletion, or transposition. This is a rapid algorithm for
915 correcting spelling errors, since many spelling errors are
916 within a Levenshtein distance of one from the original word.
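
As a rough illustration of the description above, the following Python sketch generates every string that is one insertion, deletion, or transposition away from a word and keeps the headwords found in that set. The function names and the lowercase ASCII alphabet are assumptions made for the example; this is not dictd's implementation.

```python
import string

def edits_within_one(word, alphabet=string.ascii_lowercase):
    """Strings one insertion, deletion, or transposition away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletions = {a + b[1:] for a, b in splits if b}
    transpositions = {a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1}
    insertions = {a + c + b for a, b in splits for c in alphabet}
    # The word itself (e.g. from swapping two equal letters) is not an edit.
    return (deletions | transpositions | insertions) - {word}

def lev_matches(word, headwords):
    """Headwords within one edit of word, compared case-insensitively."""
    candidates = edits_within_one(word.lower())
    return sorted(h for h in headwords if h.lower() in candidates)
```

For example, lev_matches("teh", ["the", "tech", "te", "ten"]) finds "the" (transposition), "tech" (insertion), and "te" (deletion), but not "ten", because replacing a letter is not one of the three edits listed above.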
917
918 prefix Prefix match. This algorithm also uses a binary search and is
919 very fast.
920
921 nprefix
922 Like prefix, but returns only the specified range of matches.
923 For example, if the prefix strategy returns 1000 matches, you
924 can skip the first 800 and retrieve only the next 100. To do
925 so, specify the limits in the query, as in 800#100#app, where
926 800 is the skip count, 100 is the number of matches to return,
927 and "app" is the search string. This strategy makes it
928 possible (although not trivial) to implement a DICT client with
929 fast autocompletion, as many standalone dictionary programs do.
930
931 NOTE: If you access the dictionary "*" (or virtual one) with
932 nprefix strategy, the same range is set for each database in it,
933 but globally for all matches found in all databases.
934
935 NOTE: When you access a non-English dictionary, the returned
936 matches may not be (and mostly will not be) in alphabetical
937 order.
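
The skip#count#query form can be illustrated with a short Python sketch. It assumes a single, already sorted list of headwords and uses hypothetical helper names (parse_nprefix, nprefix); how the server handles a query without the two "#" separators is not covered.

```python
from bisect import bisect_left

def parse_nprefix(query):
    """Split "800#100#app" into (800, 100, "app")."""
    skip, count, prefix = query.split("#", 2)
    return int(skip), int(count), prefix

def nprefix(sorted_headwords, query):
    """Return the requested window of prefix matches from a sorted list."""
    skip, count, prefix = parse_nprefix(query)
    lo = bisect_left(sorted_headwords, prefix)
    # Appending the highest code point gives an upper bound for the prefix.
    hi = bisect_left(sorted_headwords, prefix + "\U0010ffff")
    return sorted_headwords[lo:hi][skip:skip + count]
```

An autocompleting client could issue a small window such as 0#10#app on each keystroke instead of fetching every match.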
938
939 re POSIX 1003.2 (modern) regular expression search. Modern regular
940 expressions are the ones used by egrep(1). These regular
941 expressions allow predefined character classes (e.g.,
942 [[:alnum:]], [[:alpha:]], [[:digit:]], and [[:xdigit:]] are
943 useful for this application); use * to match a sequence of 0 or
944 more matches of the previous atom; use + to match a sequence of
945 1 or more matches of the previous atom; use ? to match 0 or 1
946 matches of the previous atom; use ^ to match the beginning of a
947 word; use $ to match the end of a word; and allow nested
948 subexpressions and alternation with () and |. For example,
949 "(foo|bar)" matches all words that contain either "foo" or
950 "bar". To match these special characters literally, quote them
951 with two backslashes (due to the quoting characteristics of the
952 server). Warning: Regular expression matches can take 10 to 300
953 times longer than substring matches. On a busy server with
954 many databases, this can require more than 5 minutes of waiting
955 time, depending on the complexity of the regular expression.
956
957 regexp Old (basic) regular expressions. These regular expressions
958 don't support |, +, or ?. Groups use escaped parentheses.
959 While modern regular expressions are generally easier to use,
960 basic regular expressions have a back reference feature. This
961 can be used to match a second occurrence of something that was
962 already matched. For example, the following expression finds
963 all words that begin and end with the same three letters:
964 ^\\(...\\).*\\1$
965
966 Note the use of the double backslashes to escape the special
967 characters. This is required by the DICT protocol string speci‐
968 fication (a single backslash quotes the next character -- we use
969 two to get a single backslash through to the regular expression
970 engine). Warning: Note that the use of backtracking is even
971 slower than the use of general regular expressions.
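
The double-backslash rule can be checked with a small Python sketch. The helper dict_quote is hypothetical and only doubles backslashes, which is the one quoting step described above. Python's re module is used to demonstrate the back reference itself; note that it is not a POSIX basic regular expression engine, so groups are written with unescaped parentheses there.

```python
import re

def dict_quote(pattern):
    """Double each backslash so that one survives DICT string unquoting."""
    return pattern.replace("\\", "\\\\")

# The basic regular expression as the regex engine should finally see it.
BRE = r"^\(...\).*\1$"

# The same idea in Python's syntax: the first three letters must recur
# at the end of the word.
SAME_THREE = re.compile(r"^(...).*\1$")
```

Here dict_quote(BRE) yields the string shown in the example above, and SAME_THREE matches words such as "entertainment" and "hotshot".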
972
973 soundex
974 The Soundex algorithm, a classic algorithm for finding words
975 that sound similar to each other. The algorithm encodes each
976 word using the first letter of the word and up to three digits.
977 Since the first letter is known, this search is relatively fast,
978 and it is sometimes good for correcting spelling errors when the
979 Levenshtein algorithm doesn't help.
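
For readers unfamiliar with Soundex, here is a sketch of the commonly described variant in Python. It is an assumption that dictd follows exactly these rules; in particular, padding short codes with zeros and the special treatment of "h" and "w" come from the usual textbook description, not from this manual.

```python
# Consonant groups of the classic Soundex code.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit

def soundex(word):
    """First letter plus up to three digits, zero-padded to four characters."""
    letters = [c for c in word.lower() if "a" <= c <= "z"]
    if not letters:
        return ""
    code = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        if ch not in "hw":  # h and w do not separate equal codes
            prev = digit
        if len(code) == 4:
            break
    return code.ljust(4, "0")
```

Words that share a code, such as "Robert" and "Rupert", are treated as sounding alike.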
980
981 substring
982 Match a substring anywhere in the headword. This search strat‐
983 egy uses a modified Boyer-Moore-Horspool algorithm. Since it
984 must search the whole index file, it is not as fast as the exact
985 and prefix matches.
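
The following Python sketch shows the plain Boyer-Moore-Horspool technique applied to a list of headwords. The server uses a modified variant that is not described here, so treat this only as an illustration of the underlying idea; the function names are hypothetical.

```python
def horspool_find(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1."""
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    # How far the window may move when its last character is ch.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    pos = 0
    while pos + m <= n:
        if text[pos:pos + m] == pattern:
            return pos
        pos += shift.get(text[pos + m - 1], m)
    return -1

def substring_matches(query, headwords):
    """Case-insensitive substring search over every headword."""
    q = query.lower()
    return [h for h in headwords if horspool_find(h.lower(), q) != -1]
```

Every headword is examined, which is why this strategy cannot match the speed of the binary searches used by exact and prefix.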
986
987 suffix Suffix match. This search strategy also uses a modified Boyer-
988 Moore-Horspool algorithm, and is as fast as the substring
989 search. If the optional index_suffix string file is listed in
990 the configuration file, this search is much faster.
991
992 word Match any single word, even if part of a multi-word entry. If
993 the optional index_word string file is listed in the
994 configuration file, this search strategy works much faster.
995
996 first Match the first word that begins a multi-word entry.
997
998 last Match the last word that ends a multi-word entry. If the
999 optional index_suffix string file is listed in the configuration
1000 file, this search strategy works much faster.
1001
1003 Databases for dictd are distributed separately. A database consists of
1004 two files. One is a flat text file, the other is the index.
1005
1006 The flat text file contains dictionary entries (or any other suitable
1007 data), and the index contains tab-delimited tuples consisting of the
1008 headword, the byte offset at which this entry begins in the flat text
1009 file, and the length of the entry in bytes. The offset and length are
1010 encoded using base 64 encoding using the 64-character subset of Inter‐
1011 national Alphabet IA5 discussed in RFC 1421 (printable encoding) and
1012 RFC 1522 (base64 MIME). Encoding the offsets in base 64 saves consid‐
1013 erable space when compared with the usual base 10 encoding, while still
1014 permitting tab characters (ASCII 9) to be used for delimiting fields in
1015 a record. Each record ends with a newline (ASCII 10), so the index
1016 file is human readable.
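
A record of the index file can be decoded with a few lines of Python. The sketch assumes that the offset and length are written as positional base-64 numbers, most significant digit first, using the alphabet A-Z, a-z, 0-9, +, / and no padding. The function names are hypothetical.

```python
B64_ALPHABET = ("ABCDEFGHIJKLMNOPQRSTUVWXYZ"
                "abcdefghijklmnopqrstuvwxyz"
                "0123456789+/")
B64_VALUE = {ch: i for i, ch in enumerate(B64_ALPHABET)}

def b64_to_int(digits):
    """Interpret digits as a base-64 number (assumed: most significant first)."""
    value = 0
    for ch in digits:
        value = value * 64 + B64_VALUE[ch]
    return value

def parse_index_line(line):
    """Return (headword, byte offset, byte length) for one index record."""
    headword, offset, length = line.rstrip("\n").split("\t")[:3]
    return headword, b64_to_int(offset), b64_to_int(length)
```

Under this assumption an offset of 64 is written as the two characters "BA", where decimal notation would also need two digits, and the saving grows with the size of the number.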
1017
1018 Some headwords have a special meaning to dictd:
1019
1020 00-database-info Contains the information about the database that is
1021 returned by the SHOW INFO command, unless it is specified in the
1022 configuration file.
1023
1024 00-database-short Contains the short name of the database that is
1025 returned by the SHOW DB command, unless it is specified in the
1026 configuration file. See dictfmt -s.
1027
1028 00-database-url URL from which the original dictionary sources were
1029 obtained. See dictfmt -u. This headword is not used by dictd.
1030
1031 00-database-utf8 Present if the dictionary is encoded using UTF-8. See
1032 dictfmt --utf8.
1033
1034 00-database-8bit-new Present if the dictionary is encoded using an 8-bit
1035 character set (not ASCII and not UTF-8). See dictfmt --locale.
1036
1037 The flat text file may be compressed using gzip(1) (not recommended) or
1038 dictzip(1) (highly recommended). Optimal speed will be obtained using
1039 an uncompressed file. However, the gzip compression algorithm works
1040 very well on plain text, and can result in space savings typically
1041 between 60 and 80%. Using a file compressed with gzip(1) is not recom‐
1042 mended, however, because random access on the file can only be accom‐
1043 plished by serially decompressing the whole file, a process which is
1044 prohibitively slow. dictzip(1) uses the same compression algorithm and
1045 file format as does gzip(1), but provides a table that can be used to
1046 randomly access compressed blocks in the file. The use of 50-64kB
1047 blocks for compression typically degrades compression by less than 10%,
1048 while maintaining acceptable random access capabilities for all data in
1049 the file. As an added benefit, files compressed with dictzip(1) can be
1050 decompressed with gzip(1) or zcat(1). (Note: recompressing a dictzip'd
1051 file using, for example, znew(1) will destroy the random access charac‐
1052 teristics of the file. Always compress data files using dictzip(1).)
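
The benefit of block-wise compression can be illustrated with a simplified Python sketch. It is not the dictzip file format: each block is compressed independently with zlib and kept in a list, and the 60000-byte block size and the function names are assumptions chosen for the example. The point is only that reading a byte range requires decompressing the blocks that cover it, not the whole file.

```python
import zlib

def compress_chunks(data, chunk_size=60_000):
    """Compress data as a list of independently decompressible blocks."""
    return [zlib.compress(data[i:i + chunk_size])
            for i in range(0, len(data), chunk_size)]

def read_range(chunks, offset, length, chunk_size=60_000):
    """Return length bytes starting at offset, touching only needed blocks."""
    first = offset // chunk_size
    last = (offset + length - 1) // chunk_size
    plain = b"".join(zlib.decompress(c) for c in chunks[first:last + 1])
    start = offset - first * chunk_size
    return plain[start:start + length]
```

With a plain gzip(1) stream there is no such table of blocks, so the same read would have to decompress everything that precedes the requested range.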
1053
1055 SIGHUP causes dictd to reread the configuration file and reinitialize
1056 the databases.
1057
1058 SIGUSR1 causes dictd to unload the databases. dictd then returns status
1059 420 (instead of 220). To load the databases again, send SIGHUP. Because
1060 database files are mapped with mmap(2), they cannot be updated while
1061 dictd has them loaded. So, to update database files and reread the
1062 configuration file, first send SIGUSR1 to dictd to unload the
1063 databases, update the files, and then send SIGHUP to load them again.
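
The update sequence can be scripted. The Python sketch below uses hypothetical helper names and takes the pid file path from the FILES section. It assumes a POSIX system, and it does not wait for dictd to finish unloading; a careful script would first confirm that the server answers with status 420 before touching the files.

```python
import os
import signal

def read_pid(path="/var/run/dictd.pid"):
    """Read the daemon's pid from its pid file."""
    with open(path) as f:
        return int(f.read().strip())

def update_databases(pid, update_files, kill=os.kill):
    """Unload the databases, run update_files(), then reload.

    kill is injectable so that the sequence can be exercised without a
    running daemon.
    """
    kill(pid, signal.SIGUSR1)  # unload: dictd now answers 420
    update_files()             # replace the .index and .dict[.dz] files
    kill(pid, signal.SIGHUP)   # reread configuration, load databases
```

A typical call would be update_databases(read_pid(), copy_new_files), where copy_new_files is whatever routine installs the new database files.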
1064
1066 The main source files for the dictd server and the dictzip compression
1067 program were written by Rik Faith (faith@dict.org) and are distributed
1068 under the terms of the GNU General Public License. If you need to dis‐
1069 tribute under other terms, write to the author.
1070
1071 The main libraries used by these programs (zlib, regex, libmaa) are
1072 distributed under different terms, so you may be able to use the
1073 libraries for applications which are incompatible with the GPL --
1074 please see the copyright notices and license information that come with
1075 the libraries for more information, and consult with your attorney to
1076 resolve these issues.
1077
1079 The regular expression searches do not ignore non-whitespace, non-
1080 alphanumeric characters as do the other searches. In practice, this
1081 isn't much of a problem.
1082
1084 Conformance of regular expressions (used by the 're' and 'regexp' search
1085 strategies) to ERE and BRE depends on the library dictd is built with.
1086 Whether the 're' and 'regexp' strategies support UTF-8 also depends on
1087 that library.
1088
1090 /etc/dictd.conf
1091 dictd configuration file
1092
1093 /usr/sbin/dictd
1094 dictd daemon itself
1095
1096 /var/run/dictd.pid
1097 File for storing pid of dictd daemon
1098
1099 /usr/share
1100 The default directory for dictd databases (.index and .dict[.dz]
1101 files)
1102
1104 examples/dictd*.conf, dictfmt(1), dict(1), dictzip(1), gunzip(1),
1105 zcat(1), webster(1), RFC 2229
1106
1107
1108
1109 29 March 2002 DICTD(8)