OVDB(5) InterNetNews Documentation OVDB(5)

ovdb - Overview storage method for INN

The ovdb overview is a storage method that uses the Berkeley DB library
to store overview data. It requires version 4.4 or later of the
Berkeley DB library (4.7+ is recommended because older versions suffer
from various issues).

The ovdb overview method makes use of the full
transaction/logging/locking functionality of the Berkeley DB
environment. Berkeley DB may be downloaded from
<https://www.oracle.com/database/technologies/related/berkeleydb.html>
and is needed to build the ovdb backend.

There are several versions of the ovdb storage method:

• Version 1, the initial version, shipped with INN 2.3.0 through
INN 2.3.5.

• Version 2, with improved performance, since INN 2.4.0.

• Version 3, corresponding to version 2 with compression enabled,
starting with INN 2.5.0.

If you have a database created with a previous version of ovdb, your
database will need to be upgraded using ovdb_init. See the
ovdb_init(8) man page for upgrade instructions, as well as the
"COMPRESSION" section below.

Note that when the Berkeley DB library is updated to a newer version,
the ovdb database also needs to be upgraded.

If the Berkeley DB library is found at configure time, INN will be
built with Berkeley DB support unless the --without-bdb flag is
explicitly passed to configure. By default, configure will search for
Berkeley DB in standard locations; there will be a message in the
configure output indicating the pathname that will be used.

You can override this pathname by adding a path to the option, for
instance --with-bdb=/usr/BerkeleyDB.4.4. This directory is expected to
have subdirectories include and lib (lib32 and lib64 are also checked),
containing db.h and the library itself, respectively. If non-standard
paths to the Berkeley DB libraries are used, one or both of the options
--with-bdb-include and --with-bdb-lib can be given to configure with a
path.

The ovdb database may take up more disk space for a given spool than
the other overview methods. Plan on needing at least 1.1 KB for every
article in your spool (not counting crossposts). So, if you have 5
million articles, you'll need at least 5.5 GB of disk space for ovdb.
With compression enabled, this estimate changes to 0.9 KB per article,
so you'll need at least 4.5 GB of disk space for 5 million articles.
See the "COMPRESSION" section below. You will also need additional
space for transaction logs: at least 100 MB. By default, the
transaction logs go in the same directory as the database. To improve
performance, they can be placed on a different disk; see the
"DB_CONFIG" section.
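The disk space arithmetic above can be sketched in a few lines of Python. This is only an illustrative calculation, not part of INN: the function name is hypothetical, and it assumes decimal units (1 KB = 1000 bytes), which is what makes 5 million articles at 1.1 KB come out to 5.5 GB.

```python
# Per-article figures quoted in the text, expressed in bytes.
# Assumption: decimal units (1 KB = 1000 bytes, 1 MB = 1000**2 bytes).
BYTES_PER_ARTICLE = 1100             # 1.1 KB, compression disabled
BYTES_PER_ARTICLE_COMPRESSED = 900   # 0.9 KB, compression enabled
MIN_LOG_BYTES = 100 * 1000**2        # "at least 100 MB" of transaction logs


def estimate_ovdb_bytes(articles: int, compress: bool = False,
                        include_logs: bool = True) -> int:
    """Return a lower-bound estimate of ovdb disk usage, in bytes."""
    if articles < 0:
        raise ValueError("articles must not be negative")
    per_article = BYTES_PER_ARTICLE_COMPRESSED if compress else BYTES_PER_ARTICLE
    total = articles * per_article
    if include_logs:
        total += MIN_LOG_BYTES
    return total


if __name__ == "__main__":
    for compress in (False, True):
        total = estimate_ovdb_bytes(5_000_000, compress)
        print(f"compress={compress}: at least {total / 1000**3:.1f} GB with logs")
```

These are minimums; if the logs are moved to another disk, call the function with include_logs=False to size the database volume on its own.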

To enable the ovdb overview method, set the ovmethod parameter in
inn.conf to "ovdb". The ovdb database is stored in the directory
specified by the pathoverview parameter in inn.conf. This is the
"DB_HOME" directory. To start out, this directory should be empty
(other than an optional DB_CONFIG file; see "DB_CONFIG" for details),
and innd (or makehistory) will create the files as necessary in that
directory. Also, make sure the directory is owned by the news user.

Other parameters for configuring ovdb are in the ovdb.conf
configuration file. The following parameters can be set in that file:

compress
If INN was compiled with zlib, and this compress parameter is true,
ovdb will compress overview records that are longer than 600 bytes.
See the "COMPRESSION" section below.

cachesize
Size of the memory pool cache, in kilobytes. The cache will have a
backing store file in the DB directory which will be at least as
big. In general, the bigger the cache, the better. Use "ovdb_stat
-m" to see cache hit percentages. To make a change of this
parameter take effect, shut down and restart INN (be sure to kill
all of the nnrpd processes when shutting down). Default is 8000
(KB), which is adequate for small to medium-sized servers. Large
servers will probably need at least 20000 (KB).

ncache
Number of regions across which to split the cache. The region size
is equal to cachesize divided by ncache. Default is 1 for ncache,
that is to say the cache will be allocated contiguously in memory.
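As a small illustration of the relation between cachesize and ncache, the following Python sketch computes the size of one cache region. The function name is hypothetical and the code only restates the division described above; it does not reflect how Berkeley DB rounds or aligns regions internally.

```python
def cache_region_kb(cachesize_kb: int = 8000, ncache: int = 1) -> float:
    """Return the size of one cache region in KB (cachesize / ncache)."""
    if cachesize_kb <= 0:
        raise ValueError("cachesize must be positive")
    if ncache < 1:
        raise ValueError("ncache must be at least 1")
    return cachesize_kb / ncache


if __name__ == "__main__":
    # Defaults from the text: one contiguous 8000 KB region.
    print(cache_region_kb())
    # A hypothetical large server splitting 20000 KB across 4 regions.
    print(cache_region_kb(20000, 4))
```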

numdbfiles
Overview data is split between this many files. Currently, innd
will keep all of the files open, so don't set this too high or innd
may run out of file descriptors. nnrpd only opens one at a time,
regardless. It may be set to one, or just a few, but only do that if
your OS supports large (> 2 GB) files. Changing this parameter has
no effect on an already-established database. Default is 32.

txn_nosync
If txn_nosync is set to false, Berkeley DB flushes the log after
every transaction. This minimizes the number of transactions that
may be lost in the event of a crash, but results in significantly
degraded performance. Default is true.

useshm
If useshm is set to true, Berkeley DB will use shared memory
instead of mmap for its environment regions (cache, lock, etc.).
On some platforms, this may improve performance. Default is
false.

shmkey
Sets the shared memory key used by Berkeley DB when useshm is true.
Berkeley DB will create several (usually 5) shared memory segments,
using sequentially numbered keys starting with "shmkey". Choose a
key that does not conflict with any existing shared memory segments
on your system. Default is 6400.

pagesize
Sets the page size for the DB files (in bytes). Must be a power of
2. Best choices are 4096 or 8192. The default is 8192. Changing
this parameter has no effect on an already-established database.

minkey
Sets the minimum number of keys per page. See the Berkeley DB
documentation for more information. Default is based on page size
and whether compression is enabled:

default_minkey = MAX(2, pagesize / 2600) if compress is false
default_minkey = MAX(2, pagesize / 1500) if compress is true

The lowest allowed minkey is 2. Setting minkey higher than the
default is not recommended, as it will cause the databases to have
a lot of overflow pages. Changing this parameter has no effect on
an already-established database.
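To see what the default minkey formulas give for common page sizes, here is a short Python sketch. It is not INN code: the function name is hypothetical, and it assumes that the division truncates to an integer (as it would in C integer arithmetic), which the formulas above do not state explicitly.

```python
def default_minkey(pagesize: int = 8192, compress: bool = False) -> int:
    """Return the default minkey for a given page size.

    Assumption: "pagesize / N" is integer (truncating) division.
    """
    # pagesize must be a power of 2, as required for the pagesize parameter.
    if pagesize <= 0 or pagesize & (pagesize - 1) != 0:
        raise ValueError("pagesize must be a power of 2")
    divisor = 1500 if compress else 2600
    return max(2, pagesize // divisor)


if __name__ == "__main__":
    for pagesize in (4096, 8192):
        for compress in (False, True):
            print(pagesize, compress, default_minkey(pagesize, compress))
```

Under that assumption, the recommended page sizes of 4096 and 8192 bytes give defaults between 2 and 5, so any explicit minkey setting would normally stay in that range.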

maxlocks
Sets the Berkeley DB lk_max parameter, which is the maximum number
of locks that can exist in the database at the same time. Default
is 4000.

nocompact
The nocompact parameter affects the behaviour of expireover. The
expireover function in ovdb can do its job in one of two ways: by
simply deleting expired records from the database; or by re-writing
the overview records into a different location leaving out the
expired records. The first method is faster, but it leaves 'holes'
that result in space that cannot immediately be reused. The
second method 'compacts' the records by rewriting them.

If this parameter is set to 0, expireover will compact all
newsgroups; if set to 1, expireover will not compact any
newsgroups; and if set to a value greater than one, expireover will
only compact groups that have fewer than that number of articles.

Experience has shown that compacting has minimal effect (other than
making expireover take longer) so the default is 1. This parameter
will probably be removed in the future.
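The three cases of the nocompact parameter can be summarized as a small decision function. The Python sketch below only mirrors the rules as worded above; the function name is hypothetical and this is not the actual expireover implementation.

```python
def should_compact(nocompact: int, article_count: int) -> bool:
    """Decide whether a newsgroup would be compacted during expireover."""
    if nocompact < 0 or article_count < 0:
        raise ValueError("values must not be negative")
    if nocompact == 0:
        return True    # compact all newsgroups
    if nocompact == 1:
        return False   # compact no newsgroups (the default)
    # Greater than one: compact only groups with fewer articles than that.
    return article_count < nocompact


if __name__ == "__main__":
    print(should_compact(0, 100000))   # True
    print(should_compact(1, 10))       # False
    print(should_compact(1000, 999))   # True
    print(should_compact(1000, 1000))  # False
```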

readserver
When the readserver parameter is set to false, each nnrpd process
directly accesses the Berkeley DB environment. The process of
attaching to the database (and detaching when finished) is fairly
expensive, and can result in high loads in situations when there
are lots of reader connections of relatively short duration.

When the readserver parameter is set to true, the nnrpd processes
will access overview via a helper server (ovdb_server, which is
started by ovdb_init). All ovdb reads will then be funnelled
through a single process with a cleaner interface to the underlying
Berkeley DB database. This will result in cleaner shutdowns for
the database, improving stability and avoiding deadlocks, timing
issues and corrupted databases. For that reason, you should try
setting this parameter to true if you are experiencing any
instability in the ovdb overview method.

Default value is true.

numrsprocs
This parameter is only used when readserver is true. It sets the
number of ovdb_server processes. As each ovdb_server can process
only one transaction at a time, running more servers can improve
reader response times. Default is 5.

maxrsconn
This parameter is only used when readserver is true. It sets a
maximum number of readers that a given ovdb_server process will
serve at one time. This means the maximum number of readers for
all of the ovdb_server processes is (numrsprocs * maxrsconn). This
does not limit the actual number of readers, since nnrpd will fall
back to opening the database directly if it can't connect to an
ovdb_server. Default is 0, which means an unlimited number of
connections is allowed.
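The relation between numrsprocs and maxrsconn can be sketched as follows. This Python function is purely illustrative (its name is hypothetical); it returns None for the unlimited case, and it only covers readers served through ovdb_server, not readers that fall back to opening the database directly.

```python
from typing import Optional


def max_server_readers(numrsprocs: int = 5, maxrsconn: int = 0) -> Optional[int]:
    """Return the reader capacity of all ovdb_server processes combined.

    None means no limit, which is what maxrsconn = 0 stands for.
    """
    if numrsprocs < 1:
        raise ValueError("numrsprocs must be at least 1")
    if maxrsconn < 0:
        raise ValueError("maxrsconn must not be negative")
    if maxrsconn == 0:
        return None
    return numrsprocs * maxrsconn


if __name__ == "__main__":
    print(max_server_readers())        # None (defaults: unlimited)
    print(max_server_readers(5, 10))   # 50
```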

The ovdb storage method has the ability to compress overview data
before it is stored into the database. In addition to consuming less
disk space, compression keeps the average size of the database keys
smaller. This in turn increases the average number of keys per page,
which can significantly improve performance and also helps keep the
database more compact. This feature requires that INN be built with
zlib. Only records larger than 600 bytes get compressed, because that
is the point at which compression starts to become significant.

If compression is not enabled (either because the compress option is
not set in ovdb.conf or because INN was not built with zlib support),
the database will be backward compatible with older versions of ovdb.
However, if compression is enabled, the database is marked with a newer
version that will prevent older versions of ovdb from opening the
database.

You can upgrade an existing database to use compression simply by
setting compress to true in ovdb.conf. Note that existing records in
the database will remain uncompressed; only new records added after
enabling compression will be compressed.

If you disable compression on a database that previously had it
enabled, new records will be stored uncompressed, but the database will
still be incompatible with older versions of ovdb (and will also be
incompatible with this version of ovdb if INN was not built with zlib
support). So to downgrade to a completely uncompressed database, you
will have to rebuild the database using makehistory.

A file called DB_CONFIG may be placed in the database directory
(pathoverview in inn.conf) to customize where the various database
files and transaction logs are written. By default, all of the files
are written in the "DB_HOME" directory. One way to improve performance
is to put the transaction logs on a different disk. To do this, put:

DB_LOG_DIR /path/to/logs

in the DB_CONFIG file. If the pathname you give starts with a "/", it
is treated as an absolute path; otherwise, it is relative to the
"DB_HOME" directory. Make sure that any directories you specify exist
and have proper ownership/mode before starting INN, because they won't
be created automatically. Also, don't change the DB_CONFIG file while
anything that uses ovdb is running.
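The rule for absolute versus relative pathnames matches how POSIX path joining behaves, which the following Python sketch uses to show where a configured directory would end up. The function name and the example paths are hypothetical; the code does not read DB_CONFIG and does not check that the directory exists or is owned by the news user.

```python
import posixpath


def resolve_db_path(db_home: str, configured: str) -> str:
    """Resolve a DB_CONFIG pathname against the DB_HOME directory.

    A pathname starting with "/" is absolute; anything else is taken
    relative to db_home.
    """
    # posixpath.join discards db_home when the second part is absolute.
    return posixpath.normpath(posixpath.join(db_home, configured))


if __name__ == "__main__":
    print(resolve_db_path("/mnt/overview", "/path/to/logs"))  # /path/to/logs
    print(resolve_db_path("/mnt/overview", "logs"))           # /mnt/overview/logs
```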

Another thing that you can do with this file is to split the overview
database across multiple disks. In the DB_CONFIG file, you can list
directories that Berkeley DB will search when it goes to open a
database.

For example, let's say that you have pathoverview set to /mnt/overview
and you have four additional file systems created on /mnt/ovX. You
would create a file /mnt/overview/DB_CONFIG containing the following
lines:

set_data_dir /mnt/overview
set_data_dir /mnt/ov1
set_data_dir /mnt/ov2
set_data_dir /mnt/ov3
set_data_dir /mnt/ov4

Distribute your ovNNNNN files into the four file systems (say, 8 each).
When called upon to open a database file, the Berkeley DB library will
look for it in each of the specified directories (in order). If the
file is not found, one will be created in the first of those
directories.
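The search order described above can be modelled with a short Python sketch. It is a simplified model of the documented behaviour, not Berkeley DB code: the function name is hypothetical, the file name ov00001 is only an illustrative instance of the ovNNNNN pattern, and file existence is supplied by a caller-provided predicate so the example does not touch the file system.

```python
import posixpath
from typing import Callable, Sequence


def locate_db_file(name: str, data_dirs: Sequence[str],
                   exists: Callable[[str], bool]) -> str:
    """Return the path at which a database file would be opened.

    Directories are searched in the order listed; if the file is found
    in none of them, it would be created in the first directory.
    """
    if not data_dirs:
        raise ValueError("at least one data directory is required")
    for directory in data_dirs:
        candidate = posixpath.join(directory, name)
        if exists(candidate):
            return candidate
    return posixpath.join(data_dirs[0], name)


if __name__ == "__main__":
    dirs = ["/mnt/overview", "/mnt/ov1", "/mnt/ov2", "/mnt/ov3", "/mnt/ov4"]
    present = {"/mnt/ov2/ov00001"}  # illustrative existing file
    print(locate_db_file("ov00001", dirs, present.__contains__))
    print(locate_db_file("ov00002", dirs, present.__contains__))
```

In the second call the file exists nowhere, so the sketch reports a path under /mnt/overview, the first directory listed.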

Whenever you change DB_CONFIG or move database files around, make sure
all news processes that use the database are shut down first (including
nnrpd processes).

The DB_CONFIG functionality is part of Berkeley DB itself, rather than
something provided by ovdb. See the Berkeley DB documentation for
complete details for the version of Berkeley DB that you're running.

When starting the news system, rc.news will invoke the ovdb_init
program. See the ovdb_init(8) man page for information about the tasks
it performs. ovdb_init must be run before using the database.

When stopping INN, rc.news kills the ovdb_monitor processes after
the other INN processes have been shut down.

Problems relating to ovdb are logged to news.err with "OVDB" in the
error message.

INN programs that use overview will fail to start up if the
ovdb_monitor processes aren't running. Be sure to run ovdb_init before
running anything that accesses overview.

Also, INN programs that use overview will fail to start up if the user
running them is not the news user.

If a program accessing the database crashes, or otherwise exits
uncleanly, it might leave a stale lock in the database. This lock
could cause other processes to deadlock on that stale lock. To fix
this, shut down all news processes (using "kill -9" if necessary) and
then restart. ovdb_init should perform a recovery operation which will
remove the locks and repair damage caused by killing the deadlocked
processes.

pathetc/inn.conf
The ovmethod and pathoverview parameters are relevant to ovdb.

pathetc/ovdb.conf
Optional configuration file for tuning. See "CONFIGURATION" above.

pathoverview
Directory where the database goes. Berkeley DB calls it the
"DB_HOME" directory.

pathoverview/DB_CONFIG
Optional file to configure the layout of the database files.

pathrun/ovdb.sem
A file that gets locked by every process that is accessing the
database. This is used by ovdb_init to determine whether the
database is active or quiescent.

pathrun/ovdb_monitor.pid
Contains the process ID of ovdb_monitor.

Implement a way to limit how many databases can be open at once (to
reduce file descriptor usage), perhaps using something similar to the
cache code in the legacy ov3.c file.

Written by Heath Kehoe <hakehoe@avalon.net> for InterNetNews.

inn.conf(5), innd(8), makehistory(8), nnrpd(8), ovdb_init(8),
ovdb_monitor(8), ovdb_stat(8).

Berkeley DB documentation: in the docs directory of the Berkeley DB
source distribution, or on the Oracle Berkeley DB web page
(<https://www.oracle.com/database/technologies/related/berkeleydb.html>).

INN 2.7.1 2023-03-07 OVDB(5)