htload(1)                   General Commands Manual                  htload(1)

NAME

       htload - reads in an ASCII-text version of the document database

SYNOPSIS

       htload [options]

DESCRIPTION

       Htload reads in an ASCII-text version of the document database, in the
       same form as that produced by the -t option of htdig and by htdump.
       Note that this will overwrite data in your databases, so it should be
       used with great care.

OPTIONS

       -a     Use alternate work files. Tells htload to append .work to
              database files, allowing it to operate on a second set of
              databases.

       -c configfile
              Use the specified configfile instead of the default.

       -i     Initial. Do not use any old databases. This is accomplished by
              first erasing the databases.

       -v     Verbose mode. This doesn't have much effect.
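
       The options above can be combined on one command line. As a minimal
       sketch, the following Python helper assembles such a command line from
       the four documented flags. The function name htload_argv and its
       keyword arguments are hypothetical and not part of ht://Dig.

```python
def htload_argv(configfile=None, alternate=False, initial=False, verbose=False):
    """Build an argument list for htload from the documented options."""
    argv = ["htload"]
    if alternate:
        argv.append("-a")            # operate on the .work set of databases
    if configfile is not None:
        argv += ["-c", configfile]   # use this file instead of the default
    if initial:
        argv.append("-i")            # erase the old databases first
    if verbose:
        argv.append("-v")
    return argv


print(htload_argv(configfile="/etc/htdig/htdig.conf", initial=True))
```

       On a system where htload is installed, the resulting list could be
       passed to subprocess.run(argv, check=True). Because htload overwrites
       existing data, such a call should only be made deliberately.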

FILE FORMATS

       Document Database
              Each line in the file starts with the document id followed by a
              list of fieldname : value pairs separated by tabs. The fields
              always appear in the order listed below:

       u      URL

       t      Title

       a      State (0 = normal, 1 = not found, 2 = not indexed, 3 = obsolete)

       m      Last modification time as reported by the server

       s      Size in bytes

       H      Excerpt

       h      Meta description

       l      Time of last retrieval

       L      Count of the links in the document (outgoing links)

       b      Count of the links to the document (incoming links or backlinks)

       c      HopCount of this document

       g      Signature of the document used for duplicate-detection

       e      E-mail address to use for a notification message from htnotify

       n      Date to send out a notification e-mail message

       S      Subject for a notification e-mail message

       d      The text of links pointing to this document (e.g. <a
              href="docURL">description</a>)

       A      Anchors in the document (i.e. <A NAME=...)
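
       The line layout described above can be read with a few lines of
       Python. This is a sketch under stated assumptions, not the parser used
       by htload: it assumes the document id is an integer, that a single
       colon separates field name and value (whitespace around it is
       ignored), that values contain no tabs, and that each field code occurs
       at most once per line. The descriptive names in DOC_FIELDS and the
       sample line are illustrative only.

```python
# Field codes from the list above, mapped to illustrative descriptive names.
DOC_FIELDS = {
    "u": "url", "t": "title", "a": "state", "m": "modified", "s": "size",
    "H": "excerpt", "h": "meta_description", "l": "last_retrieved",
    "L": "outgoing_links", "b": "backlinks", "c": "hopcount",
    "g": "signature", "e": "notify_email", "n": "notify_date",
    "S": "notify_subject", "d": "link_text", "A": "anchors",
}


def parse_doc_line(line):
    """Split one document-database line into (doc_id, {name: value})."""
    doc_id, *pairs = line.rstrip("\n").split("\t")
    record = {}
    for pair in pairs:
        # Split on the first colon only: URLs and titles contain colons too.
        code, _, value = pair.partition(":")
        code = code.strip()
        # Unknown codes are kept under their raw one-letter name.
        record[DOC_FIELDS.get(code, code)] = value.strip()
    return int(doc_id), record


# Made-up sample line with only a few of the fields present.
sample = "42\tu:http://www.example.com/\tt:Example: a page\ta:0\ts:1024"
doc_id, record = parse_doc_line(sample)
print(doc_id, record["url"], record["title"], record["size"])
```

       All values are kept as strings; converting the state, size, and count
       fields to integers is left to the caller.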
       Word Database
              While htdump and htload don't deal with the word database
              directly, it's worth mentioning it here because you need to deal
              with it when copying the ASCII databases from one system to
              another. The initial word database produced by htdig is already
              in ASCII format, and a binary version of it is produced by
              htmerge, for use by htsearch. So, when you copy over the ASCII
              version of the document database produced by htdump, you need to
              copy over the wordlist as well, then run htload to make the
              binary document database on the target system, followed by
              running htmerge to make the word index.
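
              The copy-then-rebuild order described above can be sketched in
              Python. This is only an illustration: it uses two local
              directories as a stand-in for the two systems, takes the file
              names db.docs and db.wordlist from the defaults listed under
              FILES, and the function name stage_ascii_databases is
              hypothetical. It copies both ASCII files and returns the
              commands to run on the target, in order, without running them.

```python
import shutil
from pathlib import Path

# Default ASCII file names as listed under FILES.
ASCII_FILES = ("db.docs", "db.wordlist")


def stage_ascii_databases(source_dir, target_dir):
    """Copy both ASCII databases and return the follow-up commands in order."""
    source_dir, target_dir = Path(source_dir), Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    for name in ASCII_FILES:
        # The document database and the word list must travel together.
        shutil.copy2(source_dir / name, target_dir / name)
    # htload rebuilds the binary document database first,
    # then htmerge builds the word index from the word list.
    return [["htload"], ["htmerge"]]
```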
              Each line in the word list file starts with the word followed by
              a list of fieldname : value pairs separated by tabs. The fields
              always appear in the order listed below, with the last two being
              optional:

       i      Document ID

       l      Location of word in document (1 to 1000)

       w      Weight of word based on scoring factors

       c      Count of word's appearances in document, if more than 1

       a      Anchor number if word occurred after a named anchor
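
       A word list line can be read in the same way. The sketch below is a
       hypothetical reader, not part of ht://Dig. It assumes that all field
       values are integers and that a single colon separates field name and
       value. It also interprets a missing c field as a count of 1, since the
       field is only written when the count is more than 1, and a missing a
       field as "no anchor". The sample line is made up.

```python
def parse_word_line(line):
    """Split one word-list line into a dict, filling in the optional fields."""
    word, *pairs = line.rstrip("\n").split("\t")
    fields = {}
    for pair in pairs:
        code, _, value = pair.partition(":")
        fields[code.strip()] = int(value)   # assumption: values are integers
    return {
        "word": word,
        "doc_id": fields["i"],
        "location": fields["l"],
        "weight": fields["w"],
        "count": fields.get("c", 1),   # omitted when the word occurs once
        "anchor": fields.get("a"),     # None when no named anchor precedes it
    }


# Made-up sample line without the two optional fields.
print(parse_word_line("example\ti:42\tl:17\tw:3"))
```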

FILES

       /etc/htdig/htdig.conf
              The default configuration file.

       /var/lib/htdig/db.docs
              The default ASCII document database file.

       /var/lib/htdig/db.wordlist
              The default ASCII word database file.

SEE ALSO

       Please refer to the HTML pages (in the htdig-doc package)
       /usr/share/doc/htdig-doc/html/index.html and the manual pages htdig(1),
       htmerge(1) and htdump(1) for a detailed description of ht://Dig and
       its commands.

AUTHOR

       This manual page was written by Stijn de Bekker, based on the HTML
       documentation of ht://Dig.

                                15 October 2001                      htload(1)