OVDB(5) InterNetNews Documentation OVDB(5)

NAME
ovdb - Overview storage method for INN

DESCRIPTION
The ovdb overview is a storage method that uses the Berkeley DB library
to store overview data. It requires version 4.4 or later of the
Berkeley DB library (4.7+ is recommended because older versions suffer
from various issues).

The ovdb overview method makes use of the full
transaction/logging/locking functionality of the Berkeley DB
environment. Berkeley DB may be downloaded from
<http://www.oracle.com/technetwork/database/database-technologies/berkeleydb/overview/index.html>
and is needed to build the ovdb backend.

OVDB VERSIONS
There are several versions of the ovdb storage method:

· Version 1, the initial version shipped with INN 2.3.0 up to
  INN 2.3.5.

· Version 2, with improved performance, since INN 2.4.0.

· Version 3, corresponding to version 2 with compression enabled,
  starting with INN 2.5.0.

If you have a database created with a previous version of ovdb, your
database will need to be upgraded using ovdb_init. See the
ovdb_init(8) man page for upgrade instructions, as well as the
COMPRESSION section below.

Note that when the Berkeley DB library is updated to a newer version,
the ovdb database also needs to be upgraded.

INSTALLATION
If the Berkeley DB library is found at configure time, INN will be
built with Berkeley DB support unless the --without-bdb flag is
explicitly passed to configure. By default, configure will search for
Berkeley DB in standard locations; there will be a message in the
configure output indicating the pathname that will be used.

You can override this pathname by adding a path to the option, for
instance --with-bdb=/usr/BerkeleyDB.4.4. This directory is expected to
have subdirectories include and lib (lib32 and lib64 are also checked),
containing db.h and the library itself, respectively. If non-standard
paths to the Berkeley DB libraries are used, one or both of the options
--with-bdb-include and --with-bdb-lib can be given to configure with a
path.

The ovdb database may take up more disk space for a given spool than
the other overview methods. Plan on needing at least 1.1 KB for every
article in your spool (not counting crossposts). So, if you have 5
million articles, you'll need at least 5.5 GB of disk space for ovdb.
With compression enabled, this estimate changes to 0.7 KB per article.
See the COMPRESSION section below. You'll also need additional space
for transaction logs: at least 100 MB. By default, the transaction
logs go in the same directory as the database. To improve performance,
they can be placed on a different disk; see the DB_CONFIG section.

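The sizing rule above can be turned into a quick estimate. The following is only a sketch: the function name is hypothetical, 1 KB is assumed to mean 1000 bytes (which matches the 5.5 GB figure given for 5 million articles), and the 100 MB log allowance is the stated minimum rather than a measured value.

```python
def estimate_ovdb_disk_bytes(articles, compress=False, log_bytes=100 * 10**6):
    """Rough lower bound on ovdb disk usage, in bytes.

    Hypothetical helper, not part of INN. Uses the per-article
    figures quoted in this page: 1.1 KB without compression,
    0.7 KB with compression, plus at least 100 MB of logs.
    """
    per_article = 700 if compress else 1100  # bytes, assuming 1 KB = 1000 bytes
    return articles * per_article + log_bytes


if __name__ == "__main__":
    for compress in (False, True):
        total = estimate_ovdb_disk_bytes(5_000_000, compress)
        print(f"compress={compress}: at least {total / 10**9:.1f} GB")
```

Treat the result as a floor for planning purposes; actual usage depends on the articles in your spool.
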
CONFIGURATION
To enable the ovdb overview method, set the ovmethod parameter in
inn.conf to "ovdb". The ovdb database is stored in the directory
specified by the pathoverview parameter in inn.conf. This is the
"DB_HOME" directory. To start out, this directory should be empty
(other than an optional DB_CONFIG file; see DB_CONFIG for details), and
innd (or makehistory) will create the files as necessary in that
directory. Also, make sure the directory is owned by the news user.

Other parameters for configuring ovdb are in the ovdb.conf
configuration file. The following parameters can be set in that file:

compress
    If INN was compiled with zlib, and this compress parameter is true,
    ovdb will compress overview records that are longer than 600 bytes.
    See the COMPRESSION section below.

cachesize
    Size of the memory pool cache, in kilobytes. The cache will have a
    backing store file in the DB directory which will be at least as
    big. In general, the bigger the cache, the better. Use "ovdb_stat
    -m" to see cache hit percentages. To make a change of this
    parameter take effect, shut down and restart INN (be sure to kill
    all of the nnrpd processes when shutting down). Default is 8000
    (KB), which is adequate for small to medium-sized servers. Large
    servers will probably need at least 20000 (KB).

ncache
    Number of regions across which to split the cache. The region size
    is equal to cachesize divided by ncache. Default is 1 for ncache,
    that is to say the cache will be allocated contiguously in memory.

numdbfiles
    Overview data is split between this many files. Currently, innd
    will keep all of the files open, so don't set this too high or innd
    may run out of file descriptors. nnrpd only opens one at a time,
    regardless. This parameter may be set to one, or just a few, but
    only if your OS supports large (> 2 GB) files. Changing this
    parameter has no effect on an already-established database.
    Default is 32.

txn_nosync
    If txn_nosync is set to false, Berkeley DB flushes the log after
    every transaction. This minimizes the number of transactions that
    may be lost in the event of a crash, but results in significantly
    degraded performance. Default is true.

useshm
    If useshm is set to true, Berkeley DB will use shared memory
    instead of mmap for its environment regions (cache, lock, etc).
    With some platforms, this may improve performance. Default is
    false.

shmkey
    Sets the shared memory key used by Berkeley DB when useshm is true.
    Berkeley DB will create several (usually 5) shared memory segments,
    using sequentially numbered keys starting with "shmkey". Choose a
    key that does not conflict with any existing shared memory segments
    on your system. Default is 6400.

pagesize
    Sets the page size for the DB files (in bytes). Must be a power of
    2. Best choices are 4096 or 8192. The default is 8192. Changing
    this parameter has no effect on an already-established database.

minkey
    Sets the minimum number of keys per page. See the Berkeley DB
    documentation for more information. Default is based on page size
    and whether compression is enabled:

        default_minkey = MAX(2, pagesize / 2600) if compress is false
        default_minkey = MAX(2, pagesize / 1500) if compress is true

    The lowest allowed minkey is 2. Setting minkey higher than the
    default is not recommended, as it will cause the databases to have
    a lot of overflow pages. Changing this parameter has no effect on
    an already-established database.

maxlocks
    Sets the Berkeley DB lk_max parameter, which is the maximum number
    of locks that can exist in the database at the same time. Default
    is 4000.

nocompact
    The nocompact parameter affects the behaviour of expireover. The
    expireover function in ovdb can do its job in one of two ways: by
    simply deleting expired records from the database; or by re-writing
    the overview records into a different location leaving out the
    expired records. The first method is faster, but it leaves 'holes'
    that result in space that cannot immediately be reused. The
    second method 'compacts' the records by rewriting them.

    If this parameter is set to 0, expireover will compact all
    newsgroups; if set to 1, expireover will not compact any
    newsgroups; and if set to a value greater than 1, expireover will
    only compact groups that have fewer than that number of articles.

    Experience has shown that compacting has minimal effect (other than
    making expireover take longer) so the default is 1. This parameter
    will probably be removed in the future.

readserver
    When the readserver parameter is set to false, each nnrpd process
    directly accesses the Berkeley DB environment. The process of
    attaching to the database (and detaching when finished) is fairly
    expensive, and can result in high loads in situations when there
    are lots of reader connections of relatively short duration.

    When the readserver parameter is set to true, the nnrpd processes
    will access overview via a helper server (ovdb_server, which is
    started by ovdb_init). All ovdb reads will then be funnelled
    through a single process with a cleaner interface to the underlying
    Berkeley DB database. This will result in cleaner shutdowns for
    the database, improving stability and avoiding deadlocks, timing
    issues and corrupted databases. For that reason, set this
    parameter to true if you experience any instability in the ovdb
    overview method.

    Default value is true.

numrsprocs
    This parameter is only used when readserver is true. It sets the
    number of ovdb_server processes. As each ovdb_server can process
    only one transaction at a time, running more servers can improve
    reader response times. Default is 5.

maxrsconn
    This parameter is only used when readserver is true. It sets a
    maximum number of readers that a given ovdb_server process will
    serve at one time. This means the maximum number of readers for
    all of the ovdb_server processes is (numrsprocs * maxrsconn). This
    does not limit the actual number of readers, since nnrpd will fall
    back to opening the database directly if it can't connect to an
    ovdb_server. Default is 0, which means an unlimited number of
    connections is allowed.

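Several of the parameters above are defined by small formulas or rules. The sketch below restates them for illustration only; it is not code from INN. All function names are hypothetical, and default_minkey assumes that the division in the formula is integer division, as it would be in C.

```python
def default_minkey(pagesize, compress):
    """Default minkey: MAX(2, pagesize / 2600), or / 1500 with compress.

    Assumption: the division truncates, as C integer division does.
    """
    divisor = 1500 if compress else 2600
    return max(2, pagesize // divisor)


def cache_region_kb(cachesize_kb, ncache):
    """Size of each cache region: cachesize divided by ncache."""
    return cachesize_kb / ncache


def max_server_readers(numrsprocs, maxrsconn):
    """Readers served by all ovdb_server processes together.

    Returns None when maxrsconn is 0, which means unlimited.
    """
    if maxrsconn == 0:
        return None
    return numrsprocs * maxrsconn


def should_compact(nocompact, article_count):
    """Whether expireover would compact a group, per the nocompact rule."""
    if nocompact == 0:
        return True   # compact all newsgroups
    if nocompact == 1:
        return False  # compact no newsgroups
    return article_count < nocompact


if __name__ == "__main__":
    print(default_minkey(8192, compress=False))
    print(max_server_readers(5, 10))
```

Under the integer-division assumption, a 4096-byte page size yields the minimum of 2 whether or not compression is enabled.
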
COMPRESSION
The ovdb storage method has the ability to compress overview data
before it is stored into the database. In addition to consuming less
disk space, compression keeps the average size of the database keys
smaller. This in turn increases the average number of keys per page,
which can significantly improve performance and also helps keep the
database more compact. This feature requires that INN be built with
zlib. Only records larger than 600 bytes get compressed, because that
is the point at which compression starts to become significant.

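The size threshold described above can be illustrated with Python's zlib module. This is a sketch of the idea only: it does not reproduce the record format that ovdb actually stores, and the function names and the returned flag are hypothetical.

```python
import zlib

COMPRESS_THRESHOLD = 600  # bytes; records larger than this get compressed


def maybe_compress(record):
    """Return (compressed_flag, payload) for one overview record.

    Hypothetical helper: how ovdb marks a record as compressed on
    disk is not described in this page, so a plain boolean is used.
    """
    if len(record) > COMPRESS_THRESHOLD:
        return True, zlib.compress(record)
    return False, record


def restore(compressed, payload):
    """Undo maybe_compress()."""
    return zlib.decompress(payload) if compressed else payload


if __name__ == "__main__":
    long_record = b"References: <some-message-id@example.invalid>\t" * 20
    flag, payload = maybe_compress(long_record)
    print(flag, len(long_record), len(payload))
```

Short records are passed through unchanged, which mirrors the statement that compression only becomes significant beyond 600 bytes.
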
If compression is not enabled (either from the compress option in
ovdb.conf or INN was not built with zlib support), the database will be
backward compatible with older versions of ovdb. However, if
compression is enabled, the database is marked with a newer version
that will prevent older versions of ovdb from opening the database.

You can upgrade an existing database to use compression simply by
setting compress to true in ovdb.conf. Note that existing records in
the database will remain uncompressed; only new records added after
enabling compression will be compressed.

If you disable compression on a database that previously had it
enabled, new records will be stored uncompressed, but the database will
still be incompatible with older versions of ovdb (and will also be
incompatible with this version of ovdb if INN was not built with zlib
support). So to downgrade to a completely uncompressed database, you
will have to rebuild the database using makehistory.

DB_CONFIG
A file called DB_CONFIG may be placed in the database directory
(pathoverview in inn.conf) to customize where the various database
files and transaction logs are written. By default, all of the files
are written in the "DB_HOME" directory. One way to improve performance
is to put the transaction logs on a different disk. To do this, put:

    set_lg_dir /path/to/logs

in the DB_CONFIG file. If the pathname you give starts with a "/", it
is treated as an absolute path; otherwise, it is relative to the
"DB_HOME" directory. Make sure that any directories you specify exist
and have proper ownership/mode before starting INN, because they won't
be created automatically. Also, don't change the DB_CONFIG file while
anything that uses ovdb is running.

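The absolute-versus-relative rule for pathnames in DB_CONFIG can be sketched as follows. The function name is hypothetical and the code only mirrors the rule as stated here; Berkeley DB performs the real resolution itself.

```python
import posixpath


def resolve_db_config_path(db_home, pathname):
    """Resolve a pathname from DB_CONFIG against the DB_HOME directory.

    Hypothetical helper: a pathname starting with "/" is absolute,
    anything else is taken relative to DB_HOME.
    """
    if pathname.startswith("/"):
        return pathname
    return posixpath.join(db_home, pathname)


if __name__ == "__main__":
    print(resolve_db_config_path("/mnt/overview", "/path/to/logs"))
    print(resolve_db_config_path("/mnt/overview", "logs"))
```

Either way, the resulting directory must already exist with the proper ownership and mode.
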
Another thing that you can do with this file is to split the overview
database across multiple disks. In the DB_CONFIG file, you can list
directories that Berkeley DB will search when it goes to open a
database.

For example, let's say that you have pathoverview set to /mnt/overview
and you have four additional file systems created on /mnt/ovX. You
would create a file /mnt/overview/DB_CONFIG containing the following
lines:

    set_data_dir /mnt/overview
    set_data_dir /mnt/ov1
    set_data_dir /mnt/ov2
    set_data_dir /mnt/ov3
    set_data_dir /mnt/ov4

Distribute your ovNNNNN files into the four file systems (say, 8 each).
When called upon to open a database file, the db library will look for
it in each of the specified directories (in order). If the file is
not found, one will be created in the first of those directories.

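The search order described above amounts to a first-match lookup with a fallback to the first directory. The sketch below models that behaviour for illustration; it is not how Berkeley DB is implemented, and the function name and the example file name are hypothetical.

```python
import os


def locate_db_file(data_dirs, filename):
    """Return (path, found) for a database file.

    Hypothetical model of the lookup: search data_dirs in order and
    return the first existing match. If there is none, return the
    path in the first directory, where the file would be created.
    """
    for directory in data_dirs:
        candidate = os.path.join(directory, filename)
        if os.path.exists(candidate):
            return candidate, True
    return os.path.join(data_dirs[0], filename), False


if __name__ == "__main__":
    dirs = ["/mnt/overview", "/mnt/ov1", "/mnt/ov2", "/mnt/ov3", "/mnt/ov4"]
    print(locate_db_file(dirs, "ov00001"))
```

This is why the order of the set_data_dir lines matters: any file that is missing from every listed directory ends up in the first one.
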
Whenever you change DB_CONFIG or move database files around, make sure
all news processes that use the database are shut down first (including
nnrpd processes).

The DB_CONFIG functionality is part of Berkeley DB itself, rather than
something provided by ovdb. See the Berkeley DB documentation for
complete details for the version of Berkeley DB that you're running.

RUNNING
When starting the news system, rc.news will invoke the ovdb_init
program. See the ovdb_init(8) man page for information about the tasks
it performs. ovdb_init must be run before using the database.

When stopping INN, rc.news kills the ovdb_monitor processes after
the other INN processes have been shut down.

DIAGNOSTICS
Problems relating to ovdb are logged to news.err with "OVDB" in the
error message.

INN programs that use overview will fail to start up if the
ovdb_monitor processes aren't running. Be sure to run ovdb_init before
running anything that accesses overview.

Also, INN programs that use overview will fail to start up if the user
running them is not the news user.

If a program accessing the database crashes, or otherwise exits
uncleanly, it might leave a stale lock in the database. This lock
could cause other processes to deadlock on that stale lock. To fix
this, shut down all news processes (using "kill -9" if necessary) and
then restart. ovdb_init should perform a recovery operation which will
remove the locks and repair damage caused by killing the deadlocked
processes.

FILES
pathetc/inn.conf
    The ovmethod and pathoverview parameters are relevant to ovdb.

pathetc/ovdb.conf
    Optional configuration file for tuning. See CONFIGURATION above.

pathoverview
    Directory where the database goes. Berkeley DB calls it the
    "DB_HOME" directory.

pathoverview/DB_CONFIG
    Optional file to configure the layout of the database files.

pathrun/ovdb.sem
    A file that gets locked by every process that is accessing the
    database. This is used by ovdb_init to determine whether the
    database is active or quiescent.

pathrun/ovdb_monitor.pid
    Contains the process ID of ovdb_monitor.

TO DO
Implement a way to limit how many databases can be open at once (to
reduce file descriptor usage), perhaps using something similar to the
cache code in the legacy ov3.c file.

HISTORY
Written by Heath Kehoe <hakehoe@avalon.net> for InterNetNews.

$Id: ovdb.pod 10241 2018-02-04 15:38:19Z iulius $

SEE ALSO
inn.conf(5), innd(8), makehistory(8), nnrpd(8), ovdb_init(8),
ovdb_monitor(8), ovdb_stat(8).

Berkeley DB documentation: in the docs directory of the Berkeley DB
source distribution, or on the Oracle Berkeley DB web page
(<http://www.oracle.com/technetwork/database/database-technologies/berkeleydb/overview/index.html>).

INN 2.6.3 2018-03-18 OVDB(5)