htdump(1) General Commands Manual htdump(1)

NAME
htdump - write out an ASCII-text version of the document database

SYNOPSIS
htdump [options]

DESCRIPTION
Htdump writes out an ASCII-text version of the document database in the
same form as the -t option of htdig.

OPTIONS
-a Use alternate work files. Tells htdump to append .work to database
files, allowing it to operate on a second set of databases.

-c configfile
Use the specified configfile instead of the default.

-v Verbose mode. This doesn't have much effect.

FILE FORMATS
Document Database
Each line in the file starts with the document id, followed by a
tab-separated list of fieldname : value pairs. The fields always
appear in the order listed below:

u URL

t Title

a State (0 = normal, 1 = not found, 2 = not indexed, 3 = obsolete)

m Last modification time as reported by the server

s Size in bytes

H Excerpt

h Meta description

l Time of last retrieval

L Count of the links in the document (outgoing links)

b Count of the links to the document (incoming links or backlinks)

c HopCount of this document

g Signature of the document used for duplicate detection

e E-mail address to use for a notification message from htnotify

n Date to send out a notification e-mail message

S Subject for a notification e-mail message

d The text of links pointing to this document (e.g. <a
href="docURL">description</a>)

A Anchors in the document (i.e. <A NAME=...>)
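As an illustration of the record layout described above, the following Python sketch splits one dumped line into its document id and a dictionary of fields. It is not part of ht://Dig: the function name parse_doc_line, the DOC_STATES table and the sample line are made up for this example, and it assumes each field is written as a name, a colon and a value, with optional surrounding spaces.

```python
# Hypothetical helper, not part of ht://Dig.
# State codes as listed for the "a" field above.
DOC_STATES = {"0": "normal", "1": "not found", "2": "not indexed", "3": "obsolete"}


def parse_doc_line(line):
    """Split one document record into (doc_id, fields)."""
    doc_id, *pairs = line.rstrip("\n").split("\t")
    fields = {}
    for pair in pairs:
        # Split on the first colon only, so URLs keep their own colons.
        name, sep, value = pair.partition(":")
        if sep:
            fields[name.strip()] = value.strip()
    return doc_id, fields


# Made-up sample record; real dumps carry more fields.
sample = "0\tu:http://www.example.com/\tt:Example page\ta:0\ts:1234"
doc_id, fields = parse_doc_line(sample)
print(doc_id, fields["u"], DOC_STATES[fields["a"]])
```

Values are kept as strings, so callers can convert sizes, counts and times as they see fit.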

Word Database
While htdump and htload don't deal with the word database
directly, it is worth mentioning here because you need to handle
it when copying the ASCII databases from one system to
another. The initial word database produced by htdig is already
in ASCII format, and htmerge produces a binary version of it
for use by htsearch. So, when you copy over the ASCII
version of the document database produced by htdump, you need to
copy over the word list as well, then run htload to build the
binary document database on the target system, followed by
htmerge to build the word index.
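The order of operations on the target system can be sketched in Python as follows. This is only an illustration: the function names are invented here, the copy step itself is left out, and it assumes htload and htmerge accept the same -c configfile option as htdump (see their own manual pages).

```python
import subprocess


def target_commands(config_file=None):
    """Return the commands to run on the target system, in order."""
    # Assumption: -c configfile is accepted by both htload and htmerge.
    extra = ["-c", config_file] if config_file else []
    return [
        ["htload", *extra],   # rebuild the binary document database
        ["htmerge", *extra],  # then build the word index
    ]


def rebuild(config_file=None):
    """Run the steps after db.docs and db.wordlist have been copied over."""
    for command in target_commands(config_file):
        subprocess.run(command, check=True)  # stop at the first failure
```

Calling rebuild() with no argument uses the default configuration file.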

Each line in the word list file starts with the word,
followed by a tab-separated list of fieldname : value pairs. The
fields always appear in the order listed below, with the last
two being optional:

i Document ID

l Location of word in document (1 to 1000)

w Weight of word based on scoring factors

c Count of word's appearances in document, if more than 1

a Anchor number if word occurred after a named anchor
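A word list line can be read in much the same way. The Python sketch below is illustrative only: parse_word_line and the sample line are invented for this example, and it assumes name, colon, value fields as above. A missing c field is treated as a count of 1, and a missing a field as no anchor.

```python
def parse_word_line(line):
    """Split one word list record into a dictionary (hypothetical helper)."""
    word, *pairs = line.rstrip("\n").split("\t")
    fields = {}
    for pair in pairs:
        name, sep, value = pair.partition(":")
        if sep:
            fields[name.strip()] = value.strip()
    return {
        "word": word,
        "doc_id": fields.get("i"),
        "location": fields.get("l"),
        "weight": fields.get("w"),
        "count": fields.get("c", "1"),  # c is only written when above 1
        "anchor": fields.get("a"),      # None if no named anchor applies
    }


# Made-up sample record with both optional fields left out.
entry = parse_word_line("search\ti:12\tl:37\tw:5")
print(entry["word"], entry["doc_id"], entry["count"], entry["anchor"])
```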

FILES
/etc/htdig/htdig.conf
The default configuration file.

/var/lib/htdig/db.docs
The default ASCII document database file.

/var/lib/htdig/db.wordlist
The default ASCII word database file.

SEE ALSO
Please refer to the HTML pages (in the htdig-doc package) at
/usr/share/doc/htdig-doc/html/index.html and the manual pages htdig(1)
and htload(1) for a detailed description of ht://Dig and its commands.

AUTHOR
This manual page was written by Stijn de Bekker, based on the HTML
documentation of ht://Dig.

15 October 2001 htdump(1)