BOGOUTIL(1)               Bogofilter Reference Manual              BOGOUTIL(1)

NAME

       bogoutil - Dumps, loads, and maintains bogofilter database files

SYNOPSIS

       bogoutil {-h | -V}

       bogoutil [options] {-d file | -H file | -l file | -m file | -w file |
                -p file}

       bogoutil {-r file | -R file}

       bogoutil {--db-print-leafpage-count file | --db-print-pagesize file |
                --db-verify file | --db-checkpoint directory [flag...]  |
                --db-list-logfiles directory | --db-prune directory |
                --db-recover directory | --db-recover-harder directory |
                --db-remove-environment directory}

       where options is

       bogoutil [-v] [-n] [-C] [-D] [-a age] [-c count] [-s min,max] [-y date]
                [-I file] [-O file] [-x flags] [--config-file file]

DESCRIPTION

       Bogoutil is part of the bogofilter Bayesian spam filter package.

       It is used to dump and load bogofilter's Berkeley DB databases to and
       from text files, to perform database maintenance functions, and to
       display the values for specific words.

OPTIONS

       The -d file option tells bogoutil to print the contents of the database
       file to stdout.

       The -H file option tells bogoutil to print a histogram of the database
       file to stdout. The output is similar to bogofilter -vv. Finally,
       hapaxes (tokens which were only seen once) and pure tokens (tokens
       which were encountered only in ham or only in spam) are counted.
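
       The two summary counts can be illustrated with a short Python sketch.
       It assumes that every token carries a separate spam count and ham
       count; that layout, the function name summarize, and the sample data
       are assumptions made for this example and are not taken from
       bogoutil's own code.

```python
def summarize(tokens):
    """Count hapaxes and pure tokens.

    tokens maps a word to a (spam_count, ham_count) pair. This layout is
    an assumption made for illustration only.
    """
    hapaxes = 0
    pure = 0
    for spam, ham in tokens.values():
        if spam + ham == 1:
            hapaxes += 1      # seen exactly once overall
        if (spam == 0) != (ham == 0):
            pure += 1         # seen in only one of the two classes
    return hapaxes, pure


# Hypothetical sample data.
sample = {
    "viagra": (12, 0),
    "meeting": (0, 7),
    "hello": (3, 5),
    "xyzzy": (1, 0),
}
print(summarize(sample))
```

       Note that a hapax is always a pure token as well, so the two counts
       overlap.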

       The -l file option tells bogoutil to load the data from stdin into the
       database file. If the database file exists, stdin data is merged into
       the database file, with counts added up.
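
       The merge behavior can be pictured as adding counts per token. The
       following Python sketch models the database and the stdin data as
       plain dictionaries; merge_counts and the sample values are
       hypothetical, and the real tool operates on database files rather
       than in-memory dictionaries.

```python
def merge_counts(existing, incoming):
    """Return a new dict in which counts from incoming are added to existing."""
    merged = dict(existing)
    for word, count in incoming.items():
        # Tokens already present have their counts added up;
        # new tokens are simply inserted.
        merged[word] = merged.get(word, 0) + count
    return merged


# Hypothetical sample data.
database = {"free": 10, "offer": 4}
from_stdin = {"offer": 3, "lunch": 1}
print(merge_counts(database, from_stdin))
```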

       The -m file option tells bogoutil to perform maintenance functions on
       the specified database, i.e. discard tokens that are older than
       desired, have counts that are too small, or sizes (lengths) that are
       too long or too short.

       The -w file option tells bogoutil to display token information from the
       database file. The option takes an argument, which is either the name
       of the wordlist (usually wordlist.db) or the name of the directory
       containing it. Tokens can be listed on the command line or piped to
       bogoutil. When there are extra arguments on the command line, bogoutil
       will use them as the tokens to look up. If there are no extra
       arguments, bogoutil will read tokens from stdin.

       The -p file option tells bogoutil to display the database information
       for one or more tokens. The display includes a probability column with
       the token's spam score (computed using bogofilter's default values).
       Option -p takes the same arguments as option -w.

       The -r file option tells bogoutil to recalculate the ROBX value and
       print it as a six-digit fraction.

       The -R file option does the same as -r, but saves the result in the
       training database without printing it.

       The -I file option tells bogoutil to read its input from file rather
       than stdin.

       The -O file option tells bogoutil to write its output to file rather
       than stdout.

       The -v option produces verbose output on stderr. This option is
       primarily useful for debugging.

       The -C option inhibits reading configuration files and lets bogoutil
       go with the defaults.

       The --config-file file option tells bogoutil to read file instead of
       the standard configuration file.

       The -D option redirects debug output to stdout (it usually goes to
       stderr).

       The -x flags option sets debugging flags.

       Option -n stands for "replace non-ascii characters". It replaces
       characters that have the high bit (0x80) set with question marks. This
       can be useful if a word list has lots of unreadable tokens, for example
       from Asian spam. The "bad" characters will be converted to question
       marks and matching tokens will be combined when used with -m or -l, but
       not with -d.
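
       The effect of the replacement and the combining of matching tokens
       can be sketched in Python as follows. Tokens are modeled as byte
       strings with a single count each; replace_high_bit, combine, and the
       sample word list are hypothetical names and data for this
       illustration, not bogoutil internals.

```python
def replace_high_bit(token: bytes) -> bytes:
    """Replace every byte that has the high bit (0x80) set with b'?'."""
    return bytes(0x3F if b & 0x80 else b for b in token)


def combine(counts):
    """Apply the replacement to every token and add up counts of matches."""
    combined = {}
    for token, count in counts.items():
        key = replace_high_bit(token)
        combined[key] = combined.get(key, 0) + count
    return combined


# Hypothetical word list: two tokens differ only in a high-bit byte.
wordlist = {b"caf\xe9": 2, b"caf\xe8": 3, b"plain": 1}
print(combine(wordlist))
```

       Two tokens that differ only in their high-bit bytes collapse into
       one entry, which is why the option can shrink a word list.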

       Option -a age indicates an acceptable token age, with older ones being
       discarded. The age can be a date (in form YYYYMMDD) or a day count,
       i.e. discard tokens older than age days.

       Option -c count indicates that tokens with counts less than or equal to
       count are to be discarded.

       Option -s min,max is used to discard tokens based on their size, i.e.
       length. All tokens shorter than min or longer than max will be
       discarded.
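
       Taken together, the age, count, and size criteria act as a filter
       over the tokens. The Python sketch below expresses that filter as a
       single predicate. The function names, the keyword parameters, and
       the idea of passing the age as an already computed cutoff date are
       assumptions of this example; how bogoutil applies the criteria
       internally is not described here.

```python
from datetime import date, timedelta


def cutoff_from_days(days, today):
    """Turn a day count into the oldest acceptable date."""
    return today - timedelta(days=days)


def keep_token(word, count, last_seen, *, cutoff=None, min_count=None,
               size_range=None):
    """Return True if the token survives the age, count and size checks."""
    if cutoff is not None and last_seen < cutoff:
        return False      # older than the acceptable age
    if min_count is not None and count <= min_count:
        return False      # count less than or equal to the threshold
    if size_range is not None:
        shortest, longest = size_range
        if len(word) < shortest or len(word) > longest:
            return False  # shorter than min or longer than max
    return True


# Hypothetical usage: 30 days, count threshold 1, sizes 3 to 30.
today = date(2012, 10, 22)
limits = dict(cutoff=cutoff_from_days(30, today), min_count=1,
              size_range=(3, 30))
print(keep_token("offer", 5, date(2012, 10, 1), **limits))
print(keep_token("stale", 5, date(2012, 1, 1), **limits))
```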

       Option -y date specifies the date to give to tokens that don't have
       dates. The format is YYYYMMDD.

       The -h option prints the help message and exits.

       The -V option prints the version number and exits.

ENVIRONMENT MAINTENANCE

       The --db-checkpoint dir option causes bogoutil to flush the buffer
       caches and checkpoint the database environment.

       The --db-list-logfiles dir option causes bogoutil to list the log files
       in the environment. Zero or more keywords can be added or combined
       (separated by whitespace) to modify the behavior of this mode. The
       default behavior is to list only inactive log files with relative
       paths. You can add all to list all log files (inactive and active). You
       can add absolute to switch the listing to absolute paths.

       The --db-prune dir option causes bogoutil to checkpoint the database
       environment and remove inactive log files.

       The --db-recover dir option runs a regular database recovery in the
       specified database directory. If that fails, it will retry with a
       (usually slower) catastrophic database recovery. If that also fails,
       your database cannot be repaired and must be rebuilt from scratch. This
       is only supported when bogoutil is compiled with Berkeley DB support
       with transactions enabled. Trying recovery with QDBM or SQLite3 support
       will result in an error.

       The --db-recover-harder dir option runs a catastrophic database
       recovery in the specified database directory. If that fails, your
       database cannot be repaired and must be rebuilt from scratch. This is
       only supported when bogoutil is compiled with Berkeley DB support with
       transactions enabled. Trying recovery with QDBM or SQLite3 support will
       result in an error.

       The --db-remove-environment directory option has no short option
       equivalent. It runs recovery in the given directory and then removes
       the database environment. Use this before upgrading to a new Berkeley
       DB version if the new version to be installed requires a log file
       format update.

       The --db-print-leafpage-count file option prints the number of leaf
       pages in the database file as a decimal number, or UNKNOWN if the
       database does not support querying this figure.

       The --db-print-pagesize file option prints the size of a database page
       in file as a decimal number, or UNKNOWN for databases with variable
       page size or databases that do not allow a query of the database page
       size.

       The --db-verify file option requests that bogoutil verify the database
       file. It prints only errors, unless in verbose mode.

DATA FORMAT

       Bogoutil reads and writes text files where each nonblank line consists
       of a word, any amount of horizontal whitespace, a numeric word count,
       more whitespace, and (optionally) a date in form YYYYMMDD. Blank lines
       are skipped.
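
       As an illustration of this line format, the following Python sketch
       parses such text into records. It follows the description above
       literally (one word, one count, an optional eight-digit date); the
       regular expression, the function name parse_dump, and the sample
       input are assumptions of this example and may not match every file
       that bogoutil itself accepts or produces.

```python
import re

# word, horizontal whitespace, count, optional whitespace plus YYYYMMDD date
LINE = re.compile(r"^(\S+)[ \t]+(\d+)(?:[ \t]+(\d{8}))?[ \t]*$")


def parse_dump(text):
    """Return a list of (word, count, date_or_None) tuples."""
    records = []
    for line in text.splitlines():
        if not line.strip():
            continue  # blank lines are skipped
        match = LINE.match(line)
        if match is None:
            raise ValueError(f"malformed line: {line!r}")
        word, count, day = match.groups()
        records.append((word, int(count), day))
    return records


# Hypothetical sample input with a blank line and a missing date.
sample_text = "free\t12 20121001\n\noffer 3\n"
print(parse_dump(sample_text))
```

       The date is kept as a string here; a caller that needs to compare
       ages could convert it with datetime.strptime(day, "%Y%m%d").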

RETURN VALUES

       0 for successful operation. 1 for most errors. 3 for I/O or other
       errors. Error 3 usually means that something is seriously wrong with
       the database files.
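
       A calling script might translate these statuses into messages. The
       Python sketch below is one way to do so; describe_status and
       run_bogoutil are hypothetical helper names, the message texts
       paraphrase this section, and the wrapper assumes that a bogoutil
       executable is available on the search path.

```python
import subprocess

# Meanings paraphrased from the RETURN VALUES section.
MEANINGS = {
    0: "successful operation",
    1: "error",
    3: "I/O or other error; the database files may be seriously damaged",
}


def describe_status(code):
    """Map an exit status to a short human-readable description."""
    return MEANINGS.get(code, f"unexpected exit status {code}")


def run_bogoutil(args):
    """Run bogoutil with the given arguments; return (status, description).

    Assumes a bogoutil executable can be found on PATH.
    """
    result = subprocess.run(["bogoutil", *args], capture_output=True,
                            text=True)
    return result.returncode, describe_status(result.returncode)
```

       For example, run_bogoutil(["-d", "wordlist.db"]) would dump the
       named database and report how the run ended.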

AUTHOR

       Gyepi Sam gyepi@praxis-sw.com.

       Matthias Andree matthias.andree@gmx.de.

       David Relson relson@osagesoftware.com.

       For updates, see the bogofilter project page[1].

SEE ALSO

       bogofilter(1), bogolexer(1), bogotune(1), bogoupgrade(1)

NOTES

        1. the bogofilter project page
           http://bogofilter.sourceforge.net/

Bogofilter                        10/22/2012                       BOGOUTIL(1)