INNFEED(8)              InterNetNews Documentation              INNFEED(8)

6 innfeed, imapfeed - Multi-host, multi-connection, streaming NNTP feeder
7
9 innfeed [-ChmMvxyz] [-a spool-dir] [-b directory] [-c config-file] [-d
10 log-level] [-e bytes] [-l logfile] [-o bytes] [-p pid-file] [-s
11 command] [-S status-file] [file]
12
14 innfeed implements the NNTP protocol for transferring news between
15 computers. It handles the standard IHAVE protocol as well as the
16 CHECK/TAKETHIS streaming extension. innfeed can feed any number of
17 remote hosts at once and will open multiple connections to each host if
18 configured to do so. The only limitations are the process limits for
19 open file descriptors and memory.
20
As an alternative to using NNTP, INN can also feed an IMAP server.
This is done with an executable called imapfeed, which is identical
to innfeed except for the delivery process. imapfeed uses two types
of connections: an LMTP connection to deliver regular messages and an
IMAP connection to handle control messages.
26
28 innfeed has three modes of operation: channel, funnel-file and batch.
29
Channel mode is used when no filename is given on the command line, the
input-file keyword is not given in the config file, and the -x option
is not given. In channel mode, innfeed runs with stdin connected via a
pipe to innd. Whenever innd closes this pipe (and it has several
reasons to do so during normal processing), innfeed will exit. Before
issuing a QUIT command, it first tries to finish sending all the
articles it was in the middle of transmitting. This means innfeed may
take a while to exit, depending on how slow your peers are. It never
(well, almost never) just drops the connection. The recommended way to
restart innfeed when run in channel mode is therefore to tell innd to
close the pipe and spawn a new innfeed process. This can be done with
"ctlinnd flush feed", where feed is the name of the innfeed channel
feed in the newsfeeds file.
43
Funnel-file mode is used when a filename is given as an argument or the
input-file keyword is given in the config file. In funnel-file mode,
innfeed reads from the specified file the same formatted information
that innd would give it in channel mode. It is expected that innd is
continually writing to this file, so when innfeed reaches the end of
the file, it will check periodically for new information. To prevent
the funnel file from growing without bound, you will need to
periodically move the file aside (or simply remove it) and have innd
flush the file. Then, after the file is flushed by innd, you can send
innfeed a SIGALRM, and it too will close the file and open the new file
created by innd. Something like:
55
    innfeed -p <pathrun in inn.conf>/innfeed.pid my-funnel-file &
    while true; do
        sleep 43200                       # wait 12 hours
        rm -f my-funnel-file              # remove the old funnel file
        ctlinnd flush funnel-file-site    # have innd flush the file
        kill -ALRM `cat <pathrun>/innfeed.pid`    # innfeed reopens the file
    done
63
Batch mode is used when the -x flag is given. In batch mode, innfeed
ignores stdin and simply processes any backlog left by a previously
running innfeed. This mode is not normally needed, as innfeed takes
care of backlog processing itself.
68
70 innfeed expects a couple of things to be able to run correctly: a
71 directory where it can store backlog files and a configuration file to
72 describe which peers it should handle.
73
74 The configuration file is described in innfeed.conf(5). The -c option
75 can be used to specify a different file. For each peer (say, "foo"),
76 innfeed manages up to 4 files in the backlog directory:
77
78 • A foo.lock file, which prevents other instances of innfeed from
79 interfering with this one.
80
81 • A foo.input file which has old article information innfeed is reading
82 for re-processing.
83
84 • A foo.output file where innfeed is writing information on articles
85 that could not be processed (normally due to a slow or blocked peer).
86
87 • A foo file that is never created by innfeed, but if innfeed notices
88 it, it will rename it to foo.input at the next opportunity and will
89 start reading from it. This lets you create a batch file and put it
90 in a place where innfeed will find it.
91
You should never alter the foo.input or foo.output files of a running
innfeed. Each line of these last three files (foo.input, foo.output
and foo) has one of the following formats:
94
95 /path/to/article <message-id>
96 @token@ <message-id>
97
This is the same as the first two fields of the lines innd feeds to
innfeed, and the same as the first two fields of the lines of the batch
file innd will write if innfeed is unavailable for some reason. When
innfeed processes its own batch files, it ignores everything after the
first two whitespace-separated fields, so moving the innd-created batch
file to the appropriate spot will work, even though the lines have
extra fields.
105
106 The first field can also be a storage API token. The two types of
107 lines can be intermingled; innfeed will use the storage manager if
108 appropriate, and otherwise treat the first field as a filename to read
109 directly.
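
The two line formats above can be illustrated with a minimal Python
sketch. It is not innfeed's own parser: the names parse_backlog_line
and BacklogEntry are hypothetical, the sample token and message-IDs are
made up, and recognizing a token by its surrounding "@" characters is an
assumption made for illustration (innfeed itself relies on the storage
manager).

```python
from typing import NamedTuple


class BacklogEntry(NamedTuple):
    article: str      # first field: file name or storage API token
    message_id: str   # second field
    is_token: bool    # True if the first field looks like @token@


def parse_backlog_line(line: str) -> BacklogEntry:
    """Split one backlog line, keeping only the first two fields."""
    fields = line.split()
    if len(fields) < 2:
        raise ValueError(f"expected at least two fields: {line!r}")
    article, message_id = fields[0], fields[1]
    # Assumption: a token is recognized by its surrounding '@' characters.
    is_token = len(article) > 2 and article[0] == "@" and article[-1] == "@"
    return BacklogEntry(article, message_id, is_token)


# Hypothetical sample lines. The third one carries extra fields, as in a
# batch file written by innd; they are ignored.
samples = [
    "/path/to/article <first@example.com>",
    "@0123456789ABCDEF@ <second@example.com>",
    "/path/to/other <third@example.com> peer1 peer2",
]
entries = [parse_backlog_line(s) for s in samples]
for entry in entries:
    print(entry)
```

Splitting on arbitrary whitespace and discarding everything past the
second field mirrors the behaviour described above for batch files with
extra fields.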
110
111 innfeed writes its current status to the file innfeed.status (or the
112 file given by the -S option). This file contains details on the
113 process as a whole, and on each peer this instance of innfeed is
114 managing.
115
116 If innfeed is told to send an article to a host it is not managing,
117 then the article information will be put into a file matching the
118 pattern innfeed-dropped.*, with part of the file name matching the pid
119 of the innfeed process that is writing to it. innfeed will not process
120 this file except to write to it. If nothing is written to the file,
121 then it will be removed if innfeed exits normally. Otherwise, the file
122 remains, and procbatch can be invoked to process it afterwards.
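
As a rough illustration, leftover dropped files can be located with a
few lines of Python before procbatch is run on them. This is only a
sketch: the function name leftover_dropped_files is hypothetical, the
directory to search is an assumption (pass whichever directory your
innfeed writes these files to), and the file names and contents in the
demonstration are made up.

```python
import glob
import os
import tempfile


def leftover_dropped_files(directory):
    """Return the non-empty files matching innfeed-dropped.* in directory."""
    pattern = os.path.join(directory, "innfeed-dropped.*")
    return sorted(p for p in glob.glob(pattern) if os.path.getsize(p) > 0)


# Demonstration in a temporary directory with made-up file names.
with tempfile.TemporaryDirectory() as demo_dir:
    with open(os.path.join(demo_dir, "innfeed-dropped.A012345"), "w") as f:
        f.write("/path/to/article <id@example.com>\n")
    # An empty file, such as one left behind by an abnormal exit.
    open(os.path.join(demo_dir, "innfeed-dropped.A067890"), "w").close()
    found = [os.path.basename(p) for p in leftover_dropped_files(demo_dir)]
    print(found)
```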
123
125 Upon receipt of a SIGALRM, innfeed will close the funnel file specified
126 on the command line, and will reopen it (see funnel file description
127 above).
128
innfeed will catch SIGINT and will write a large debugging snapshot of
the state of the running system.
131
132 innfeed will catch SIGHUP and will reload both the config and the log
133 files. See innfeed.conf(5) for more details.
134
135 innfeed will catch SIGCHLD and will close and reopen all backlog files.
136
137 innfeed will catch SIGTERM and will do an orderly shutdown.
138
Upon receipt of a SIGUSR1, innfeed will increment the debugging level
by one; receipt of a SIGUSR2 will decrement it by one. The debugging
level starts at zero unless the -d option is used; at level zero, no
debugging information is emitted. A larger value for the level means
more debugging information. Numbers up to 5 are currently useful.
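
The signals above can be sent from a small helper script. The Python
sketch below is one possible way to do it, not part of INN: the action
names and the function signal_innfeed are hypothetical, and the helper
assumes the pid file contains only the process ID. The demonstration
records the call instead of signalling a real process.

```python
import os
import signal
import tempfile

# The action names are hypothetical labels; the signals are those
# described above.
ACTIONS = {
    "reopen-funnel-file": signal.SIGALRM,
    "debug-snapshot": signal.SIGINT,
    "reload": signal.SIGHUP,
    "reopen-backlog-files": signal.SIGCHLD,
    "shutdown": signal.SIGTERM,
    "more-debug": signal.SIGUSR1,
    "less-debug": signal.SIGUSR2,
}


def signal_innfeed(action, pid_file, kill=os.kill):
    """Send the signal for action to the process named in pid_file."""
    with open(pid_file) as f:
        pid = int(f.read().strip())
    kill(pid, ACTIONS[action])
    return pid


# Dry run: record the call rather than signalling a real process.
sent = []
with tempfile.NamedTemporaryFile("w", suffix=".pid", delete=False) as f:
    f.write("16015\n")
signal_innfeed("reload", f.name, kill=lambda pid, sig: sent.append((pid, sig)))
os.unlink(f.name)
print(sent)
```

Passing the kill function as a parameter keeps the helper testable; in
real use the default os.kill is left in place.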
144
146 There are 3 different categories of syslog entries for statistics:
147 host, connection and global.
148
149 The host statistics are generated for a given peer at regular intervals
150 after the first connection is made (or, if the remote is unreachable,
151 after spooling starts). The host statistics give totals over all
152 connections that have been active during the given time frame. For
153 example (broken here to fit the page, with "vixie" being the peer):
154
155 May 23 12:49:08 news innfeed[16015]: vixie checkpoint
156 seconds 1381 offered 2744 accepted 1286 refused 1021
157 rejected 437 missing 0 accsize 8506220 rejsize 142129
158 spooled 990 on_close 0 unspooled 240 deferred 10/15.3
159 requeued 25 queue 42.1/100:14,35,13,4,24,10
160
161 The meanings of these fields are:
162
163 seconds
164 The time since innfeed connected to the host or since the statistics
165 were reset by a "final" log entry.
166
167 offered
The number of IHAVE commands sent to the host if it is not in
streaming mode. In streaming mode, the number of CHECK commands sent
(when no-CHECK mode is not in effect) plus the number of TAKETHIS
commands sent when no-CHECK mode is in effect.
172
173 accepted
174 The number of articles which were sent to the remote host and
175 accepted by it.
176
177 refused
178 The number of articles offered to the host that it indicated it did
179 not want because it had already seen the message-ID. The remote host
180 indicates this by sending a 435 response to an IHAVE command or a 438
181 response to a CHECK command.
182
183 rejected
184 The number of articles transferred to the host that it did not accept
185 because it determined either that it already had the article or it
186 did not want it because of the article's Newsgroups or Distribution
187 header fields, etc. The remote host indicates that it is rejecting
188 the article by sending a 437 or 439 response after innfeed sent the
189 entire article.
190
191 missing
192 The number of articles which innfeed was told to offer to the host
193 but which were not present in the article spool. These articles were
194 probably cancelled or expired before innfeed was able to offer them
195 to the host.
196
197 accsize
198 The number of bytes of all accepted articles transferred to the host.
199
200 rejsize
201 The number of bytes of all rejected articles transferred to the host.
202
203 spooled
204 The number of article entries that were written to the .output
205 backlog file because the articles either could not be sent to the
206 host or were refused by it. Articles are generally spooled either
207 because new articles are arriving more quickly than they can be
208 offered to the host, or because innfeed closed all the connections to
209 the host and pushed all the articles currently in progress to the
210 .output backlog file.
211
212 on_close
213 The number of articles that were spooled when innfeed closed all the
214 connections to the host.
215
216 unspooled
217 The number of article entries that were read from the .input backlog
218 file.
219
220 deferred
221 The first number is the number of articles that the host told innfeed
222 to retry later by sending a 431 or 436 response. innfeed immediately
223 puts these articles back on the tail of the queue.
224
The second number is the average (mean) size of deferred articles
during the previous logging interval.
227
228 requeued
The number of articles that were in progress on connections when
innfeed dropped those connections and put the articles back on the
queue. These connections may have been broken by a network problem
or may have become unresponsive, causing innfeed to time them out.
233
234 queue
235 The first number is the average (mean) queue size during the previous
236 logging interval. The second number is the maximum allowable queue
237 size. The third number is the percentage of the time that the queue
238 was empty. The fourth through seventh numbers are the percentages of
239 the time that the queue was >0% to 25% full, 25% to 50% full, 50% to
240 75% full, and 75% to <100% full. The last number is the percentage
241 of the time that the queue was totally full.
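
To show how these fields fit together, here is a minimal Python sketch
that turns a host statistics entry into a dictionary. It is an
illustration only: the function name parse_stats and the keys of the
nested dictionaries are hypothetical, and the sketch assumes the entry
has been joined back into a single line (the example above was broken
to fit the page).

```python
def parse_stats(entry):
    """Parse the "name value" pairs of an innfeed statistics entry."""
    tokens = entry.split()
    for kind in ("checkpoint", "final", "global"):
        if kind in tokens:
            start = tokens.index(kind)
            break
    else:
        raise ValueError("not a statistics entry")
    pairs = tokens[start + 1:]
    if len(pairs) % 2:
        raise ValueError("odd number of name/value tokens")
    stats = {"peer": tokens[start - 1], "kind": kind}
    for name, value in zip(pairs[0::2], pairs[1::2]):
        if name == "deferred":
            # count/mean-size
            count, mean_size = value.split("/")
            stats[name] = {"count": int(count), "mean_size": float(mean_size)}
        elif name == "queue":
            # mean/max:percentages of time spent at each fill level
            sizes, buckets = value.split(":")
            mean, maximum = sizes.split("/")
            stats[name] = {
                "mean": float(mean),
                "max": int(maximum),
                "percent": [int(p) for p in buckets.split(",")],
            }
        else:
            stats[name] = int(value)
    return stats


line = (
    "May 23 12:49:08 news innfeed[16015]: vixie checkpoint "
    "seconds 1381 offered 2744 accepted 1286 refused 1021 "
    "rejected 437 missing 0 accsize 8506220 rejsize 142129 "
    "spooled 990 on_close 0 unspooled 240 deferred 10/15.3 "
    "requeued 25 queue 42.1/100:14,35,13,4,24,10"
)
stats = parse_stats(line)
print(stats["peer"], stats["offered"], stats["queue"]["percent"])
```

In the sample entry, the accepted, refused and rejected counts add up
to the offered count, and the six queue percentages add up to 100.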
242
243 If the -z option is used (see below), then when the peer stats are
244 generated, each connection will log its stats too. For example, for
245 connection number zero (from a set of five):
246
247 May 23 12:49:08 news innfeed[16015]: vixie:0 checkpoint
248 seconds 1381 offered 596 accepted 274 refused 225
249 rejected 97 accsize 773623 rejsize 86591
250
If you only open a maximum of one connection to a remote, then there
will be a close correlation between connection numbers and host
numbers, but in general you cannot tie the two sets of numbers together
in any easy or very meaningful way. When a connection closes, it will
always log its stats.
256
257 If all connections for a host get closed together, then the host logs
258 its stats as "final" and resets its counters. If the feed is so busy
259 that there is always at least one connection open and running, then
260 after some amount of time (set via the config file), the host stats are
logged as final and reset. This makes it easier for other programs to
generate higher-level statistics from the log files.
263
264 There is one log entry that is emitted for a host just after its last
265 connection closes and innfeed is preparing to exit. This entry
266 contains counts over the entire life of the process. The "seconds"
267 field is from the first time a connection was successfully built, or
268 the first time spooling started. If a host has been completely idle,
269 it will have no such log entry.
270
271 May 23 12:49:08 news innfeed[16015]: decwrl global
272 seconds 1381 offered 34 accepted 22 refused 3 rejected 7
273 missing 0 accsize 81277 rejsize 12738 spooled 0 unspooled 0
274
275 The final log entry is emitted immediately before exiting. It contains
276 a summary of the statistics over the entire life of the process.
277
278 Feb 13 14:43:41 news innfeed[22344]: ME global
279 seconds 15742 offered 273441 accepted 45750 refused 222008
280 rejected 3334 missing 217 accsize 93647166 rejsize 7421839
281 spooled 10 unspooled 0
282
284 innfeed takes the following options.
285
286 -a spool-dir
287 The -a flag is used to specify the top of the article spool tree.
288 innfeed does a chdir(2) to this directory, so it should probably be
289 an absolute path. The default is patharticles as set in inn.conf.
290
291 -b directory
292 The -b flag may be used to specify a different directory for
293 backlog file storage and retrieval, as well as for lock files. If
294 the path is relative, then it is relative to pathspool as set in
295 inn.conf. The default is "innfeed".
296
297 -c config-file
298 The -c flag may be used to specify a different config file from the
299 default value. If the path is relative, then it is relative to
300 pathetc as set in inn.conf. The default is innfeed.conf.
301
302 -C The -C flag is used to have innfeed simply check the config file,
303 report on any errors and then exit.
304
305 -d log-level
306 The -d flag may be used to specify the initial logging level. All
307 debugging messages go to stderr (which may not be what you want,
308 see the -l flag below).
309
310 -e bytes
311 The -e flag may be used to specify the size limit (in bytes) for
312 the .output backlog files innfeed creates. If the output file gets
313 bigger than 10% more than the given number, innfeed will replace
314 the output file with the tail of the original version. The default
315 value is 0, which means there is no limit.
316
317 -h Use the -h flag to print the usage message.
318
319 -l logfile
The -l flag may be used to specify a different log file from
stderr. As innd starts innfeed with stderr attached to /dev/null,
using this option can be useful in catching any abnormal error
messages, or any debugging messages (all "normal" error messages
go to syslog).
325
326 -m The -m flag is used to turn on logging of all missing articles.
327 Normally, if an article is missing, innfeed keeps a count, but logs
328 no further information. When this flag is used, details about
329 message-IDs and expected path names are logged.
330
331 -M If innfeed has been built with mmap support, then the -M flag turns
332 OFF the use of mmap(); otherwise, it has no effect.
333
334 -o bytes
The -o flag sets the maximum number of bytes of article data
innfeed is supposed to keep in memory. This does not work
properly yet.
338
339 -p pid-file
340 The -p flag is used to specify the file name to write the pid of
341 the process into. A relative path is relative to pathrun as set in
342 inn.conf. The default is innfeed.pid.
343
344 -s command
The -s flag specifies the name of a command to run in a subprocess
and read article information from. This is similar to channel mode
operation, except that command takes the place usually occupied by
innd.
349
350 -S status-file
351 The -S flag specifies the name of the file to write the periodic
352 status to. If the path is relative, it is considered relative to
353 pathlog as set in inn.conf. The default is innfeed.status.
354
355 -v When the -v flag is given, version information is printed to stderr
356 and then innfeed exits.
357
358 -x The -x flag is used to tell innfeed not to expect any article
359 information from innd but just to process any backlog files that
360 exist and then exit.
361
362 -y The -y flag is used to allow dynamic peer binding. If this flag is
363 used and article information is received from innd that specifies
364 an unknown peer, then the peer name is taken to be the IP name too,
365 and an association with it is created. Using this, it is possible
366 to only have the global defaults in the innfeed.conf file, provided
367 the peer name as used by innd is the same as the IP name.
368
Note that running innfeed with -y and no peer in innfeed.conf
causes innfeed to drop the first article.
371
372 -z The -z flag is used to cause each connection, in a parallel feed
373 configuration, to report statistics when the controller for the
374 connections prints its statistics.
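
The size check described for the -e option above can be written as a
short Python sketch. The function name exceeds_backlog_limit is
hypothetical, and the sketch covers only the trigger condition (more
than 10% over the limit, with 0 meaning no limit). It does not model how
much of the tail innfeed keeps, which is not stated here.

```python
def exceeds_backlog_limit(current_size, limit):
    """Return True if an .output file is more than 10% over the -e limit."""
    if limit <= 0:
        # A limit of 0 (the default) means the file may grow freely.
        return False
    # Integer form of: current_size > limit * 1.10
    return current_size * 10 > limit * 11


print(exceeds_backlog_limit(1100, 1000))   # exactly 10% over
print(exceeds_backlog_limit(1101, 1000))   # just past 10% over
print(exceeds_backlog_limit(10**9, 0))     # no limit set
```

Integer arithmetic is used so that the comparison at the exact 10%
boundary does not depend on floating-point rounding.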
375
When using the -x option, the config file entry's initial-connections
field will be the total number of connections created and used, no
matter how big the batch file is and no matter how large a value the
max-connections field specifies. Thus a value of 0 for
initial-connections means nothing will happen in -x mode.
382
innfeed does not automatically grab the file out of pathoutgoing. The
file needs to be put in place for innfeed by external means.
385
386 Probably too many other bugs to count.
387
389 An alternative to innfeed can be innduct, maintained by Ian Jackson and
390 available at
391 <http://www.chiark.greenend.org.uk/ucgi/~ian/git-manpage/innduct.git/innduct.8>.
It is intended to solve a design issue in the way innfeed works. As a
matter of fact, the program feed protocol spoken between innd and
innfeed is lossy: if innfeed dies unexpectedly, articles which innd has
written to the pipe to innfeed will be skipped. innd has no way of
telling which articles those are, keeps no useful records, and makes no
attempt to resend these articles.
398
400 pathbin/innfeed
401 The binary program itself.
402
403 pathetc/innfeed.conf
404 The configuration file.
405
406 pathspool/innfeed
407 The directory for backlog files.
408
410 Written by James Brister <brister@vix.com> for InterNetNews. Converted
411 to POD by Julien Elie.
412
413 Earlier versions of innfeed (up to 0.10.1) were shipped separately;
414 innfeed is now part of INN and shares the same version number.
415
417 ctlinnd(8), inn.conf(5), innfeed.conf(5), innd(8), procbatch(8).
418

INN 2.7.1                       2023-03-07                      INNFEED(8)