STORAGE.CONF(5)           InterNetNews Documentation           STORAGE.CONF(5)

NAME

       storage.conf - Configuration file for storage manager

DESCRIPTION

       The file <pathetc>/storage.conf contains the rules to be used in
       assigning articles to different storage methods.  These rules determine
       where incoming articles will be stored.

       The storage manager is a unified interface between INN and a variety of
       different storage methods, allowing the news administrator to choose
       between different storage methods with different trade-offs (or even
       use several at the same time for different newsgroups, or articles of
       different sizes).  The rest of INN need not care what type of storage
       method was used for a given article; the storage manager will figure
       this out automatically when that article is retrieved via the storage
       API.  Note that you may also want to see the options provided in
       inn.conf(5) regarding article storage.

       The storage.conf file consists of a series of storage method entries.
       Blank lines and lines beginning with a number sign ("#") are ignored.
       The maximum number of characters in each line is 255.  The order of
       entries in this file is important; see below.

       Each entry specifies a storage method and a set of rules.  Articles
       which match all of the rules of a storage method entry will be stored
       using that storage method; if an article matches multiple storage
       method entries, the first one will be used.  Each entry is formatted as
       follows:

           method <methodname> {
               class: <storage_class>
               newsgroups: <wildmat>
               size: <minsize>[,<maxsize>]
               expires: <mintime>[,<maxtime>]
               options: <options>
               exactmatch: <bool>
           }

       If spaces or tabs are included in a value, that value must be enclosed
       in double quotes ("").  If either a number sign ("#") or a double quote
       is meant to be included verbatim in a value, it should be escaped
       with "\".

       <methodname> is the name of a storage method to use for articles which
       match the rules of this entry.  The currently available storage methods
       are:

           cnfs
           timecaf
           timehash
           tradspool
           trash

       See the "STORAGE METHODS" section below for more details.

       The meanings of the keys in each storage method entry are as follows:

       class: <storage_class>
           An identifier for this storage method entry.  <storage_class>
           should be a number between 0 and 255.  It should be unique across
           all of the entries in this file.  It is mainly used for specifying
           expiration times by storage class as described in expire.ctl(5);
           "timehash" and "timecaf" will also set the top-level directory in
           which articles accepted by this storage class are stored.  The
           assignment of a particular number to a storage class is arbitrary
           but permanent (since it is used in storage tokens).  Storage
           classes can, for instance, be numbered sequentially in storage.conf.

       newsgroups: <wildmat>
           What newsgroups are stored using this storage method.  <wildmat> is
           a uwildmat(3) pattern which is matched against the newsgroups an
           article is posted to.  If storeonxref in inn.conf is true, this
           pattern will be matched against the newsgroup names in the Xref:
           header; otherwise, it will be matched against the newsgroup names
           in the Newsgroups: header (see inn.conf(5) for discussion of the
           differences between these possibilities).  Poison wildmat
           expressions (expressions starting with "@") are allowed and can be
           used to exclude certain group patterns:  articles crossposted to
           poisoned newsgroups will not be stored using this storage method.
           The <wildmat> pattern is matched in order.

           There is no default newsgroups pattern; if an entry should match
           all newsgroups, use an explicit "newsgroups: *".

       size: <minsize>[,<maxsize>]
           A range of article sizes (in bytes) which should be stored using
           this storage method.  If <maxsize> is 0 or not given, the upper
           size of articles is limited only by maxartsize in inn.conf.  The
           size: field is optional and may be omitted entirely if you want
           articles of any size to be stored in this storage method (if, of
           course, these articles fulfill all the other requirements of this
           storage method entry).  By default, <minsize> is set to 0.

       expires: <mintime>[,<maxtime>]
           A range of article expiration times which should be stored using
           this storage method.  Be careful; this is less useful than it may
           appear at first.  This is based only on the Expires: header of the
           article, not on any local expiration policies or anything in
           expire.ctl!  If <mintime> is non-zero, then this entry will not
           match any article without an Expires: header.  This key is
           therefore only really useful for assigning articles with requested
           longer expire times to a separate storage method.  Articles only
           match if the time until expiration (that is to say, the amount of
           time into the future that the Expires: header of the article
           requests that it remain around) falls in the interval specified by
           <mintime> and <maxtime>.

           The format of these parameters is "0d0h0m0s" (days, hours, minutes,
           and seconds into the future).  If <maxtime> is "0s" or is not
           specified, there is no upper bound on expire times falling into
           this entry (note that this key has no effect on when the article
           will actually be expired, but only on whether or not the article
           will be stored using this storage method).  This field is also
           optional and may be omitted entirely if you do not want to store
           articles according to their Expires: header, if any.

           A <mintime> value greater than "0s" implies that this storage
           method won't match any article without an Expires: header.
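
           As a rough illustration of the range check described above, the
           following Python sketch parses the time format and tests an
           article against <mintime> and <maxtime>.  The helper names are
           hypothetical and are not part of INN.  The sketch assumes that
           each component may be omitted (so that "30d" is accepted), that
           both bounds are inclusive, and that an article without an
           Expires: header matches only when <mintime> is zero.

```python
import re

# Hypothetical helpers for illustration only; INN's own parser may differ.
_UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}
_INTERVAL = re.compile(r"(?:(\d+)d)?(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?")


def parse_interval(text):
    """Convert text such as "0d0h0m0s", "30d" or "1d12h" to seconds."""
    parts = _INTERVAL.fullmatch(text)
    if not text or parts is None:
        raise ValueError(f"not a valid interval: {text!r}")
    total = 0
    for value, unit in zip(parts.groups(), "dhms"):
        if value is not None:
            total += int(value) * _UNIT_SECONDS[unit]
    return total


def expires_matches(seconds_until_expiry, mintime="0s", maxtime="0s"):
    """Check an article against an expires: range.

    seconds_until_expiry is None when the article has no Expires: header.
    A maxtime of zero means there is no upper bound.
    """
    low = parse_interval(mintime)
    high = parse_interval(maxtime)
    if seconds_until_expiry is None:
        # Assumption: only a zero <mintime> lets such an article match.
        return low == 0
    if seconds_until_expiry < low:
        return False
    return high == 0 or seconds_until_expiry <= high
```

           Under these assumptions, an article asking to be kept for 40 days
           matches "expires: 30d" but not "expires: 30d,35d", and an article
           without an Expires: header matches neither.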

       options: <options>
           This key is for passing special options to storage methods that
           require them (currently only "cnfs").  See the "STORAGE METHODS"
           section below for a description of its use.

       exactmatch: <bool>
           If this key is set to true, all the newsgroups in the Newsgroups:
           header of incoming articles must match the newsgroups pattern for
           this entry to apply.  (Normally, any non-zero number of matching
           newsgroups is sufficient, provided no newsgroup matches a poison
           wildmat as described above.)  This is a boolean value; "true",
           "yes" and "on" are usable to enable this key.  The case of these
           values is not significant.  The default is false.

       If an article matches all of the constraints of an entry, it is stored
       via that storage method and is associated with that <storage_class>.
       This file is scanned in order and the first matching entry is used to
       store the article.

       If an article does not match any entry, either by being posted to a
       newsgroup which does not match any of the <wildmat> patterns or by
       being outside the size and expires ranges of all entries whose
       newsgroups pattern it does match, the article is not stored and is
       rejected by innd.  When this happens, the error message:

           cant store article: no matching entry in storage.conf

       is logged to syslog.  If you want to silently drop articles matching
       certain newsgroup patterns or size or expires ranges, assign them to
       the "trash" storage method rather than having them not match any
       storage method entry.
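
       The first-match scan described above can be sketched in Python as
       follows.  This is only an illustration under simplifying assumptions,
       not INN's implementation: Entry, SAMPLE and pick_entry are
       hypothetical names, each entry carries a single pattern matched with
       Python's fnmatch rather than uwildmat(3), the expires: key is
       ignored, and the size bounds are treated as inclusive.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass
class Entry:
    """One storage method entry, reduced to the keys used in this sketch."""
    method: str
    storage_class: int
    newsgroups: str   # a single simplified pattern, not a full wildmat
    minsize: int = 0
    maxsize: int = 0  # 0 means "no upper bound"


def pick_entry(entries, groups, size):
    """Return the first entry matching the article, or None if none does."""
    for entry in entries:
        if not any(fnmatchcase(group, entry.newsgroups) for group in groups):
            continue
        if size < entry.minsize:
            continue
        if entry.maxsize and size > entry.maxsize:
            continue
        return entry  # entries are scanned in file order; first match wins
    return None       # innd would reject the article in this case


# Entries chosen for illustration only.
SAMPLE = [
    Entry("tradspool", 1, "internal.*"),
    Entry("cnfs", 2, "alt.binaries.*"),
    Entry("cnfs", 3, "*", minsize=50000),
    Entry("timehash", 4, "alt.*"),
    Entry("timehash", 5, "*"),
]
```

       With these sample entries, a 100-byte article in alt.test skips
       class 3 because it is too small and lands in class 4, whereas a
       60000-byte article in comp.lang.c is caught by class 3.  Without the
       final catch-all entry, a small article in comp.lang.c would match
       nothing.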

STORAGE METHODS

       Currently, there are five storage methods available.  Each method has
       its pros and cons; you can choose any mixture of them as is suitable
       for your environment.  Note that each method has an attribute
       EXPENSIVESTAT which indicates whether checking the existence of an
       article is expensive or not.  This attribute is taken into account
       when running expireover(8).

       cnfs
           The "cnfs" storage method stores articles in large cyclic buffers
           (CNFS stands for Cyclic News File System).  Articles are stored in
           CNFS buffers in arrival order, and when the buffer fills, it wraps
           around to the beginning and stores new articles over the top of the
           oldest articles in the buffer.  The expire time of articles stored
           in CNFS buffers is therefore entirely determined by how long it
           takes the buffer to wrap around, which depends on how quickly data
           is being stored in it.  (This method is therefore said to have
           self-expire functionality.  It also means that when an article is
           cancelled, the cycbuff does not go back and reuse its space until
           it rolls over and the whole cycbuff starts being reused.)
           EXPENSIVESTAT is false for this method.

           CNFS has its own configuration file, cycbuff.conf, which describes
           some subtleties to the basic description given above.  Storage
           method entries for the "cnfs" storage method must have an options:
           field specifying the metacycbuff into which articles matching that
           entry should be stored; see cycbuff.conf(5) for details on
           metacycbuffs.

           Advantages:  By far the fastest of all storage methods (except for
           "trash"), since it eliminates the overhead of dealing with a file
           system and creating new files.  Unlike all other storage methods,
           it does not require manual article expiration.  With CNFS, the
           server will never throttle itself due to a full spool disk, and
           groups are restricted to just the buffer files given so that they
           can never use more than the amount of disk space allocated to them.

           Disadvantages:  Article retention times are more difficult to
           control because old articles are overwritten automatically.
           Attacks on Usenet, such as flooding or massive amounts of spam, can
           result in wanted articles expiring much faster than intended (with
           no warning).

       timecaf
           This method stores multiple articles in one file, whose name is
           based on the article's arrival time and the storage class.  The
           file name will be:

               <patharticles>/timecaf-nn/bb/aacc.CF

           where "nn" is the hexadecimal value of <storage_class>, "bb" and
           "aacc" are the hexadecimal components of the arrival time, and "CF"
           is a hardcoded extension.  (The arrival time, in seconds since the
           epoch, is converted to hexadecimal and interpreted as 0xaabbccdd,
           with "aa", "bb", and "cc" used to build the path.)  This method
           does not have self-expire functionality (meaning expire has to run
           periodically to delete old articles, as well as cancelled articles
           if immediatecancel is not set to true in inn.conf).  EXPENSIVESTAT
           is false for this method.
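
           To make the naming scheme concrete, here is a small Python sketch
           that builds the file name from an arrival time, following the
           0xaabbccdd description above.  The function name is hypothetical,
           the spool directory passed in is a placeholder, and the
           zero-padding of each component to two hexadecimal digits is an
           assumption of this sketch rather than something stated above.

```python
def timecaf_path(patharticles, storage_class, arrival):
    """Build a timecaf file name from an arrival time (seconds since epoch).

    Hypothetical helper: it follows the documented 0xaabbccdd layout and
    assumes each component is zero-padded to two hexadecimal digits.
    """
    aa = (arrival >> 24) & 0xFF
    bb = (arrival >> 16) & 0xFF
    cc = (arrival >> 8) & 0xFF
    # "dd" (the low byte) is not part of the timecaf path.
    return f"{patharticles}/timecaf-{storage_class:02x}/{bb:02x}/{aa:02x}{cc:02x}.CF"


# Placeholder spool directory and arrival time, for illustration only.
example = timecaf_path("/var/spool/news/articles", 4, 0x5A6E3C1D)
```

           Here example is "/var/spool/news/articles/timecaf-04/6e/5a3c.CF".
           Because "dd" does not appear in the path, articles arriving
           within the same 256-second window would share one file under this
           reading.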

           Advantages:  It is roughly four times faster than "timehash" for
           article writes, since much of the file system overhead is bypassed,
           while still retaining the same fine control over article retention
           time.

           Disadvantages:  Using this method means giving up all but the most
           careful manual fiddling with the article spool; in this respect,
           it resembles "cnfs".  As one of the newer and least widely used
           storage types, "timecaf" has not been as thoroughly tested as the
           other methods.

       timehash
           This method is very similar to "timecaf" except that each article
           is stored in a separate file.  The name of the file for a given
           article will be:

               <patharticles>/time-nn/bb/cc/yyyy-aadd

           where "nn" is the hexadecimal value of <storage_class>, "yyyy" is a
           hexadecimal sequence number, and "bb", "cc", and "aadd" are
           components of the arrival time in hexadecimal (the arrival time is
           interpreted as documented above under "timecaf").  This method does
           not have self-expire functionality.  Cancelled articles are removed
           immediately.  EXPENSIVESTAT is true for this method.
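
           The same kind of sketch applies to "timehash".  The Python
           function below is hypothetical and only follows the description
           above; it assumes two-digit zero-padding for the time components
           and the storage class, and four digits for the sequence number.
           The spool directory is a placeholder.

```python
def timehash_path(patharticles, storage_class, arrival, seqnum):
    """Build a timehash file name from an arrival time and sequence number.

    Hypothetical helper: the arrival time is read as 0xaabbccdd, and the
    zero-padding widths are assumptions of this sketch.
    """
    aa = (arrival >> 24) & 0xFF
    bb = (arrival >> 16) & 0xFF
    cc = (arrival >> 8) & 0xFF
    dd = arrival & 0xFF
    return (
        f"{patharticles}/time-{storage_class:02x}/{bb:02x}/{cc:02x}/"
        f"{seqnum:04x}-{aa:02x}{dd:02x}"
    )


# Placeholder spool directory, arrival time and sequence number.
example = timehash_path("/var/spool/news/articles", 4, 0x5A6E3C1D, 1)
```

           Here example is "/var/spool/news/articles/time-04/6e/3c/0001-5a1d".
           Unlike "timecaf", all four bytes of the arrival time appear in
           the path, and the sequence number keeps articles apart even when
           they arrive during the same second.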

           Advantages:  Heavy traffic groups do not cause bottlenecks, and a
           fine control of article retention time is still possible.

           Disadvantages:  The ability to easily find all articles in a given
           newsgroup and manually fiddle with the article spool is lost, and
           INN still suffers from speed degradation due to file system
           overhead (creating and deleting individual files is a slow
           operation).

       tradspool
           Traditional spool, or "tradspool", is the traditional news article
           storage format.  Each article is stored in an individual text file
           named:

               <patharticles>/news/group/name/nnnnn

           where "news/group/name" is the name of the newsgroup to which the
           article was posted with each period changed to a slash, and "nnnnn"
           is the sequence number of the article in that newsgroup.  For
           crossposted articles, the article is linked into each newsgroup to
           which it is crossposted (using either hard or symbolic links).
           This is the way versions of INN prior to 2.0 stored all articles,
           as well as being the article storage format used by C News and
           earlier news systems.  This method does not have self-expire
           functionality.  Cancelled articles are removed immediately.
           EXPENSIVESTAT is true for this method.
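
           The mapping from newsgroup name to file name is simple enough to
           sketch in a few lines of Python.  The function names and the
           spool directory are hypothetical; the second helper merely lists
           the paths a crossposted article would be linked under and does
           not create any links.

```python
def tradspool_path(patharticles, newsgroup, artnum):
    """Map a newsgroup name and article number to a tradspool file name."""
    # Each period in the newsgroup name becomes a directory separator.
    return f"{patharticles}/{newsgroup.replace('.', '/')}/{artnum}"


def crosspost_paths(patharticles, numbered_groups):
    """List one path per newsgroup for a crossposted article.

    numbered_groups is a list of (newsgroup, article number) pairs.
    """
    return [tradspool_path(patharticles, group, num) for group, num in numbered_groups]


# Placeholder spool directory and article numbers, for illustration only.
paths = crosspost_paths(
    "/var/spool/news/articles",
    [("comp.lang.c", 12345), ("misc.test", 678)],
)
```

           Here paths holds "/var/spool/news/articles/comp/lang/c/12345" and
           "/var/spool/news/articles/misc/test/678".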

           Advantages:  It is widely used and well-understood; it can read
           article spools written by older versions of INN and it is
           compatible with all third-party INN add-ons.  This storage
           mechanism provides easy and direct access to the articles stored on
           the server and makes writing programs that fiddle with the news
           spool very easy, and gives fine control over article retention
           times.

           Disadvantages:  It takes a very fast file system and I/O system to
           keep up with current Usenet traffic volumes due to file system
           overhead.  Groups with heavy traffic tend to create a bottleneck
           because of inefficiencies in storing large numbers of article files
           in a single directory.  It requires a nightly expire program to
           delete old articles out of the news spool, a process that can slow
           down the server for several hours or more.

       trash
           This method silently discards all articles stored in it.  Its only
           real uses are for testing and for silently discarding articles
           matching a particular storage method entry (for whatever reason).
           Articles stored in this method take up no disk space and can never
           be retrieved, so this method has self-expire functionality of a
           sort.  EXPENSIVESTAT is false for this method.

EXAMPLES

       The following sample storage.conf file would store all articles posted
       to alt.binaries.* in the "BINARIES" CNFS metacycbuff, all articles over
       roughly 50 KB in any other hierarchy in the "LARGE" CNFS metacycbuff,
       all other articles in alt.* in one timehash class, and all other
       articles in any newsgroups in a second timehash class, except for the
       internal.* hierarchy which is stored in traditional spool format.

           method tradspool {
               class: 1
               newsgroups: internal.*
           }
           method cnfs {
               class: 2
               newsgroups: alt.binaries.*
               options: BINARIES
           }
           method cnfs {
               class: 3
               newsgroups: *
               size: 50000
               options: LARGE
           }
           method timehash {
               class: 4
               newsgroups: alt.*
           }
           method timehash {
               class: 5
               newsgroups: *
           }

       Notice that the last storage method entry will catch everything.  This
       is a good habit to get into; make sure that you have at least one
       catch-all entry just in case something you did not expect falls through
       the cracks.  Notice also that the special rule for the internal.*
       hierarchy is first, so it will catch even articles crossposted to
       alt.binaries.* or over 50 KB in size.

       As for poison wildmat expressions, if you have for instance an article
       crossposted between misc.foo and misc.bar, the pattern:

           misc.*,!misc.bar

       will match that article whereas the pattern:

           misc.*,@misc.bar

       will not match that article.  An article posted only to misc.bar will
       fail to match either pattern.
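
       The difference between "!" and "@" can be checked with a short Python
       sketch.  It is a simplification, not the uwildmat(3) algorithm: the
       function names are hypothetical, sub-patterns are matched with
       Python's fnmatch, the last matching sub-pattern decides the status of
       each newsgroup, and escaping is ignored.  The exactmatch parameter
       mirrors the key of the same name, on the reading that every newsgroup
       must then match.

```python
from fnmatch import fnmatchcase


def group_status(group, wildmat):
    """Classify one newsgroup as "match", "poison" or "none".

    Simplified stand-in for uwildmat(3): sub-patterns are applied in order
    and the last one that matches the newsgroup wins.
    """
    status = "none"
    for pattern in wildmat.split(","):
        if pattern.startswith("@"):
            if fnmatchcase(group, pattern[1:]):
                status = "poison"
        elif pattern.startswith("!"):
            if fnmatchcase(group, pattern[1:]):
                status = "none"
        elif fnmatchcase(group, pattern):
            status = "match"
    return status


def article_matches(groups, wildmat, exactmatch=False):
    """Decide whether an article posted to groups matches the pattern."""
    statuses = [group_status(group, wildmat) for group in groups]
    if not statuses or "poison" in statuses:
        return False  # a single poisoned newsgroup rejects the article
    matched = [status == "match" for status in statuses]
    return all(matched) if exactmatch else any(matched)
```

       With this sketch, the article crossposted to misc.foo and misc.bar
       matches "misc.*,!misc.bar" but not "misc.*,@misc.bar", and an article
       posted only to misc.bar matches neither.  With exactmatch set, the
       crossposted article no longer matches "misc.*,!misc.bar" either,
       since misc.bar itself does not match.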

       Usually, high-volume groups and groups whose articles do not need to be
       kept around very long (binaries groups, *.jobs*, news.lists.filters,
       etc.) are stored in CNFS buffers.  Use the other methods (or CNFS
       buffers again) for everything else.  However, it is often most
       convenient to keep special hierarchies in "tradspool", such as local
       hierarchies, hierarchies that should never expire, and hierarchies
       whose spool you need to go through manually.

HISTORY

       Written by Katsuhiro Kondou <kondou@nec.co.jp> for InterNetNews.
       Rewritten into POD by Julien Elie.

       $Id: storage.conf.pod 10230 2018-01-28 21:22:21Z iulius $

SEE ALSO

       cycbuff.conf(5), expire.ctl(5), expireover(8), inn.conf(5), innd(8),
       uwildmat(3).



INN 2.6.3                         2018-01-28                   STORAGE.CONF(5)