ESTWAVER(3) Hyper Estraier ESTWAVER(3)

NAME
estwaver - command line interface of the web crawler

SYNOPSIS
estwaver init [-apn|-acc] [-xs|-xl|-xh] [-sv|-si|-sa] rootdir

estwaver crawl [-restart|-revisit|-revcont] rootdir

estwaver unittest rootdir

estwaver fetch [-proxy host port] [-tout num] [-il lang] url

DESCRIPTION
estwaver is an aggregation of sub commands. The name of a sub command
is specified by the first argument. Other arguments are parsed according
to each sub command. The argument rootdir specifies the crawler root
directory, which contains the configuration file and other crawler data.

estwaver init [-apn|-acc] [-xs|-xl|-xh] [-sv|-si|-sa] rootdir
Create the crawler root directory.
If -apn is specified, N-gram analysis is also applied to European
text.
If -acc is specified, character category analysis is performed
instead of N-gram analysis.
If -xs is specified, the index is tuned to register fewer than
50000 documents.
If -xl is specified, the index is tuned to register more than
300000 documents.
If -xh is specified, the index is tuned to register more than
1000000 documents.
If -sv is specified, scores are stored as void.
If -si is specified, scores are stored as 32-bit integers.
If -sa is specified, scores are stored as-is and marked not to
be tuned when searching.

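As a rough illustration of choosing among the size-tuning options of init, the sketch below maps an expected document count to -xs, -xl, or -xh using the thresholds stated above. The helper names size_flag and init_command, the path ./crawler, and the choice to pass no size option for counts between the thresholds are assumptions made for this example. The command line is printed rather than executed.

```shell
# Hypothetical helper: pick an init size option from an expected
# document count, using the thresholds in the option descriptions.
size_flag() {
  count=$1
  if [ "$count" -gt 1000000 ]; then
    echo "-xh"
  elif [ "$count" -gt 300000 ]; then
    echo "-xl"
  elif [ "$count" -lt 50000 ]; then
    echo "-xs"
  else
    echo ""    # assumption: rely on the default tuning in between
  fi
}

# Hypothetical helper: print (not run) the resulting init command.
init_command() {
  flag=$(size_flag "$1")
  echo "estwaver init ${flag:+$flag }$2"
}

init_command 20000 ./crawler    # prints: estwaver init -xs ./crawler
```

To create the directory for real, run the printed command instead of only echoing it.
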
estwaver crawl [-restart|-revisit|-revcont] rootdir
Start crawling.
If -restart is specified, crawling is restarted from the seed
documents.
If -revisit is specified, collected documents are revisited.
If -revcont is specified, collected documents are revisited and
then crawling is continued.

estwaver unittest rootdir
Perform unit tests.

estwaver fetch [-proxy host port] [-tout num] [-il lang] url
Fetch a document.
url specifies the URL of a document.
-proxy specifies the host name and the port number of the proxy
server.
-tout specifies the timeout in seconds.
-il specifies the preferred language. By default, it is English.

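The fetch options can be combined as in the following sketch, which assembles a command line from a URL, an optional timeout, and an optional proxy. The helper name fetch_command, the host proxy.example.com, and port 8080 are placeholders invented for this example. -il is left out because the accepted language values are not listed here.

```shell
# Hypothetical helper: print (not run) a fetch command line.
# Arguments: url [timeout_seconds [proxy_host proxy_port]]
fetch_command() {
  url=$1
  tout=$2
  proxy_host=$3
  proxy_port=$4
  cmd="estwaver fetch"
  # -proxy takes two values, so add it only when both are given
  if [ -n "$proxy_host" ] && [ -n "$proxy_port" ]; then
    cmd="$cmd -proxy $proxy_host $proxy_port"
  fi
  if [ -n "$tout" ]; then
    cmd="$cmd -tout $tout"
  fi
  echo "$cmd $url"
}

fetch_command "http://example.com/" 10 proxy.example.com 8080
# prints: estwaver fetch -proxy proxy.example.com 8080 -tout 10 http://example.com/
```
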
All sub commands return 0 if the operation succeeds, and 1 otherwise.
A running crawler closes the database and exits when it catches
signal 1 (SIGHUP), 2 (SIGINT), 3 (SIGQUIT), or 15 (SIGTERM).

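A script that drives the crawler might use the exit status and the signals as sketched below. run_step and stop_crawler are hypothetical helpers, and crawler_pid stands for a process ID the caller has recorded. The only facts taken from this page are the 0/1 status convention and that SIGTERM is one of the signals on which the crawler closes its database.

```shell
# Hypothetical helper: run a command and report on its exit status.
run_step() {
  if "$@"; then
    echo "ok: $1"
    return 0
  else
    status=$?
    echo "failed ($status): $1"
    return "$status"
  fi
}

# Hypothetical helper: ask a running crawler to shut down cleanly.
# SIGTERM (15) is one of the signals the crawler catches.
stop_crawler() {
  kill -TERM "$1"
}

run_step true    # stand-in command; prints: ok: true

# Intended use, assuming estwaver is installed and ./crawler exists:
#   run_step estwaver crawl ./crawler
#   stop_crawler "$crawler_pid"
```
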
When crawling finishes, the crawler root directory contains a directory
named _index. It is an index that can be used by estcmd and other tools.

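One way to inspect the resulting index is sketched below. The helper index_path and the path ./crawler are assumptions for this example. It relies on the inform sub command of estcmd, which reports information about an index, and it prints a message instead when the index or estcmd is not available.

```shell
# Hypothetical helper: derive the index path from the crawler root,
# dropping one trailing slash if present.
index_path() {
  echo "${1%/}/_index"
}

rootdir=./crawler    # assumed crawler root directory
index=$(index_path "$rootdir")

if [ -d "$index" ] && command -v estcmd >/dev/null 2>&1; then
  estcmd inform "$index"
else
  echo "no index at $index yet, or estcmd is not installed"
fi
```
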
SEE ALSO
estconfig(1), estcmd(1), estmaster(1), estcall(1), estraier(3),
estnode(3)

Man Page 2007-03-06 ESTWAVER(3)