PULLNEWS(1)               InterNetNews Documentation               PULLNEWS(1)

NAME

       pullnews - Pull news from multiple news servers and feed it to another

SYNOPSIS

       pullnews [-BhnOqRx] [-a hashfeed] [-b fraction] [-c config] [-C width]
       [-d level] [-f fraction] [-F fakehop] [-g groups] [-G newsgroups]
       [-H headers] [-k checkpt] [-l logfile] [-L size] [-m header_pats]
       [-M num] [-N timeout] [-p port] [-P hop_limit] [-Q level] [-r file]
       [-s to-server[:port][_tlsmode]] [-S max-run] [-t retries]
       [-T connect-pause] [-w num] [-z article-pause] [-Z group-pause]
       [from-server ...]

REQUIREMENTS

       The "Net::NNTP" module must be installed.  This module is available as
       part of the libnet distribution and comes with recent versions of Perl.
       For older versions of Perl, you can download it from
       <http://www.cpan.org/>.

DESCRIPTION

       pullnews reads a config file named pullnews.marks, and connects to the
       upstream servers given there as a reader client.  This file is looked
       for in pathdb when pullnews is run as the user set in runasuser in
       inn.conf (which is by default the "news" user); otherwise, this file is
       looked for in the running user's home directory.

       By default, pullnews connects to all servers listed in the
       configuration file, but you can limit pullnews to specific servers by
       listing them on the command line as a whitespace-separated list of
       server names (the from-server arguments in the synopsis).  For each
       server it connects to, it pulls over articles and feeds them to the
       destination server via the IHAVE or POST commands.  This means that the
       system pullnews is run on must have feeding access to the destination
       news server.

       pullnews is designed for very small sites that do not want to bother
       setting up traditional peering and is not meant for handling large
       feeds.

OPTIONS

       -a hashfeed
           This option is a deterministic way to control the flow of articles
           and to split a feed.  The hashfeed parameter must be in the form
           "value/mod" or "start-end/mod".  The Message-ID of each article is
           hashed using MD5, which results in a 128-bit hash.  The lowest
           32 bits are then taken by default as the hashfeed value (which is
           an integer).  If the hashfeed value modulo "mod", plus one, equals
           "value" or lies between "start" and "end" inclusive, pullnews will
           feed the article.  All these numbers must be integers.

           For instance:

               pullnews -a 1/2      Feeds about 50% of all articles.
               pullnews -a 2/2      Feeds the other 50% of all articles.

           Another example:

               pullnews -a 1-3/10   Feeds about 30% of all articles.
               pullnews -a 4-5/10   Feeds about 20% of all articles.
               pullnews -a 6-10/10  Feeds about 50% of all articles.

           You can use an extended syntax of the form "value/mod:offset" or
           "start-end/mod:offset" (using an underscore "_" instead of a colon
           ":" is also recognized).  As MD5 generates a 128-bit return value,
           it is possible to specify from which byte-offset the 32-bit integer
           used by hashfeed starts.  The default value for "offset" is ":0"
           and thirteen overlapping values from ":0" to ":12" can be used.
           Only up to four totally independent values exist: ":0", ":4", ":8"
           and ":12".

           This makes it possible to generate a second level of deterministic
           distribution.  Indeed, if pullnews feeds "1/2", it can go on
           splitting thanks to "1-3/9:4" for instance.  Up to four levels of
           deterministic distribution can be used.

           The algorithm is compatible with the one used by Diablo 5.1 and up.
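
           The selection rule described above can be sketched in Python.
           This is an illustration only, not the code pullnews runs (pullnews
           is a Perl script).  The function names are hypothetical, and the
           sketch assumes that the "lowest 32 bits" are the last four bytes
           of the MD5 digest read as a big-endian integer, and that offset
           slides that four-byte window toward the start of the digest.

```python
import hashlib
import re

# Hypothetical helper names; this is not the pullnews implementation.
_SPEC = re.compile(r"^(\d+)(?:-(\d+))?/(\d+)(?:[:_](\d+))?$")


def hashfeed_value(message_id: str, mod: int, offset: int = 0) -> int:
    """Return (32-bit slice of MD5(message_id) modulo mod) + 1."""
    digest = hashlib.md5(message_id.encode("utf-8")).digest()  # 16 bytes
    # Assumption: offset 0 selects the last four bytes (the lowest 32 bits),
    # and larger offsets slide the window toward the start of the digest.
    start = len(digest) - 4 - offset
    word = int.from_bytes(digest[start:start + 4], "big")
    return word % mod + 1


def hashfeed_accepts(message_id: str, spec: str) -> bool:
    """Check a Message-ID against "value/mod" or "start-end/mod[:offset]"."""
    match = _SPEC.match(spec)
    if match is None:
        raise ValueError(f"bad hashfeed spec: {spec!r}")
    low = int(match.group(1))
    high = int(match.group(2)) if match.group(2) else low
    mod = int(match.group(3))
    offset = int(match.group(4)) if match.group(4) else 0
    if mod < 1 or not 0 <= offset <= 12:
        raise ValueError(f"bad hashfeed spec: {spec!r}")
    return low <= hashfeed_value(message_id, mod, offset) <= high


if __name__ == "__main__":
    msgid = "<example-1@news.example.invalid>"  # made-up Message-ID
    for spec in ("1-3/10", "4-5/10", "6-10/10"):
        print(spec, hashfeed_accepts(msgid, spec))
```

           Because the three ranges in the example cover 1 to 10 without
           overlap, exactly one of them accepts any given Message-ID, which
           is what makes the split deterministic.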

       -b fraction
           Backtrack on server numbering reset.  Specify the proportion (0.0
           to 1.0) of a group's articles to pull when the server's article
           number is less than our high for that group.  When fraction is 1.0,
           pull all the articles on a renumbered server.  The default is to do
           nothing.

       -B  Feed is header-only, that is to say pullnews only feeds the headers
           of the articles, plus one blank line.  It adds the Bytes header
           field if the article does not already have one, and keeps the body
           only if the article is a control article.

       -c config
           Normally, the config file is stored in pullnews.marks in pathdb
           when pullnews is run as the news user, or otherwise in the running
           user's home directory.  If -c is given, config will be used as the
           config file instead.  This is useful if you're running pullnews as
           a system user on an automated basis out of cron or as an individual
           user, rather than the news user.

           See "CONFIG FILE" below for the format of this file.

       -C width
           Use width characters per line for the progress table.  The default
           value is 50.

       -d level
           Set the debugging level to the integer level (up to 4); more
           debugging output will be logged as this increases.  The default
           value is 0.

       -f fraction
           This changes the proportion of articles to get from each group to
           fraction and should be in the range 0.0 to 1.0 (1.0 being the
           default).

       -F fakehop
           Prepend fakehop as a host to the Path header field body of articles
           fed.

       -g groups
           Specify a collection of groups to get.  groups is a list of
           newsgroups separated by commas (only commas, no spaces).  Each
           group must be defined in the config file, and only the remote hosts
           that carry those groups will be contacted.  Note that this is a
           simple list of groups, not a wildmat expression, and wildcards are
           not supported.

       -G newsgroups
           Add the comma-separated list of groups newsgroups to each server in
           the configuration file (see also -g and -w).

       -h  Print a usage message and exit.

       -H headers
           Remove these named header fields (colon-separated list) from fed
           articles.

       -k checkpt
           Checkpoint (save) the config file every checkpt articles (default
           is 0, that is to say at the end of the session).

       -l logfile
           Log progress/stats to logfile (default is "stdout").

       -L size
           Specify the largest wanted article size in bytes.  The default is
           to download all articles, whatever their size.  When this option is
           used, pullnews will first retrieve overview data (if available) of
           each newsgroup to process so as to obtain article sizes, before
           deciding which articles to actually download.

       -m header_pats
           Feed an article based on header field body matching.  The argument
           is a number of whitespace-separated tuples (each tuple being a
           colon-separated header field name and regular expression).  For
           instance:

               -m "Hdr1:regexp1 !Hdr2:regexp2 #Hdr3:regexp3 !#Hdr4:regexp4"

           specifies that the article will be passed only if the "Hdr1" header
           field body matches "regexp1" and the "Hdr2" header field body does
           not match "regexp2".  In addition, if the "Hdr3" header field body
           matches "regexp3", that header is removed; and if the "Hdr4" header
           field body does not match "regexp4", that header is removed.
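
           The four tuple forms can be sketched as follows.  This is a
           minimal Python illustration with hypothetical function names, not
           the pullnews code: it uses Python's re module rather than Perl
           regular expressions, assumes unanchored matching against a
           dictionary of header field bodies, and assumes that a missing
           header field is matched as an empty body.

```python
import re


def parse_header_pats(arg: str):
    """Split a -m argument into (negate, remove, header, regex) tuples."""
    rules = []
    for token in arg.split():
        negate = token.startswith("!")
        if negate:
            token = token[1:]
        remove = token.startswith("#")
        if remove:
            token = token[1:]
        name, _, pattern = token.partition(":")
        rules.append((negate, remove, name, re.compile(pattern)))
    return rules


def apply_header_pats(headers: dict, rules):
    """Return the filtered headers, or None if the article is not passed."""
    kept = dict(headers)
    for negate, remove, name, regex in rules:
        # Assumption: a missing header field is matched as an empty body.
        matched = regex.search(headers.get(name, "")) is not None
        hit = matched != negate  # "!" inverts the sense of the match
        if remove:
            if hit:
                kept.pop(name, None)  # "#" rules only strip a header field
        elif not hit:
            return None  # a plain rule failed, so the article is dropped
    return kept


if __name__ == "__main__":
    rules = parse_header_pats(r"Newsgroups:^comp\. !Subject:MONEY #X-Trace:.")
    article = {"Newsgroups": "comp.lang.c", "Subject": "hello", "X-Trace": "x"}
    print(apply_header_pats(article, rules))
```

           Rules without "#" decide whether the article is fed at all, while
           rules with "#" never reject an article and only strip the named
           header field.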

       -M num
           Specify the maximum number of articles (per group) to process.  The
           default is to process all new articles.  See also -f.

       -n  Do nothing but read articles -- does not feed articles downstream,
           writes no rnews file, does not update the config file.

       -N timeout
           Specify the timeout length, as timeout seconds, when establishing
           an NNTP connection.

       -O  Use an optimized mode: pullnews checks whether the article already
           exists on the downstream server, before downloading it.  It may
           help for huge articles or a slow link to upstream hosts.

       -p port
           Connect to the destination news server on a port other than the
           default of 119.  This option does not change the port used to
           connect to the source news servers.

       -P hop_limit
           Restrict feeding an article based on the number of hops it has
           already made.  Count the hops in the Path header field body
           (hop_count), feeding the article only when hop_limit is "+num" and
           hop_count is more than num; or hop_limit is "-num" and hop_count is
           less than num.
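
           The comparison can be sketched in Python as below.  The function
           names are hypothetical, and how pullnews itself counts hops is not
           stated here: the sketch assumes that every non-empty element of
           the "!"-separated Path header field body counts as one hop.

```python
def hop_count(path_body: str) -> int:
    """Count hops in a Path header field body such as "a!b!c"."""
    # Assumption: every non-empty "!"-separated element counts as one hop.
    return sum(1 for element in path_body.split("!") if element.strip())


def passes_hop_limit(path_body: str, hop_limit: str) -> bool:
    """Apply a "+num" or "-num" hop_limit to the hop count of one article."""
    if hop_limit[:1] not in ("+", "-"):
        raise ValueError(f"hop_limit must start with + or -: {hop_limit!r}")
    num = int(hop_limit[1:])
    hops = hop_count(path_body)
    if hop_limit[0] == "+":
        return hops > num  # feed only articles with more than num hops
    return hops < num      # feed only articles with fewer than num hops


if __name__ == "__main__":
    path = "news.example.invalid!peer.example.invalid!origin.example.invalid"
    print(hop_count(path), passes_hop_limit(path, "-4"))
```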

       -q  Print out less status information while running.

       -Q level
           Set the quietness level ("-Q 2" is equivalent to "-q").  The higher
           this value, the less gets logged.  The default is 0.

       -r file
           Rather than feeding the downloaded articles to a destination
           server, instead create a batch file that can later be fed to a
           server using rnews.  See rnews(1) for more information about the
           batch file format.

       -R  Be a reader (use MODE READER and POST commands) to the downstream
           server.  Some posts will then be rejected because of unexpected
           injection header fields, obsolete or incorrectly formatted header
           fields, or with a date too far in the past.  You may then want to
           set artcutoff to 0 in inn.conf, and use the -H flag to strip
           unwanted header fields.  Even with that, a few articles may still
           be rejected.

           The default is to behave like a feeder and use the IHAVE command.
           (You'll have to allow in incoming.conf the connections from
           pullnews so that it is recognized as a feeder.)

       -s to-server[:port][_tlsmode]
           Normally, pullnews will feed the articles it retrieves to the news
           server running on localhost.  To connect to a different host,
           specify a server with the -s flag.  You can also specify the port
           with this same flag or use -p.  Default port is 119.

           The connection is by default unencrypted.  To negotiate a TLS
           encryption layer, you can set tlsmode to "TLS" for implicit TLS
           (negotiated immediately upon connection on a dedicated port) or
           "STARTTLS" for explicit TLS (the appropriate command will be sent
           before authenticating or feeding messages).  Examples of use are:

               pullnews -s news.server.com
               pullnews -s news.server.com_STARTTLS
               pullnews -s news.server.com:563_TLS

           Note that not all NNTP servers implement TLS for feeding articles.

       -S max-run
           Specify the maximum time max-run in seconds for pullnews to run.

       -t retries
           The maximum number (retries) of attempts to connect to a server or
           reconnect to a server if the socket is unexpectedly closed (see
           also -T).  The default is 0.

       -T connect-pause
           Pause connect-pause seconds between connection retries (see also
           -t).  The default is 1.

       -w num
           Set each group's high water mark (last received article number) to
           num.  If num is negative, calculate Current+num instead (i.e. get
           the last num articles).  Therefore, a num of 0 will re-get all
           articles on the server; whereas a num of "-0" will get no old
           articles, setting the water mark to Current (the most recent
           article on the server).
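
           The difference between "0" and "-0" only exists while the
           argument is still text, as the following Python sketch shows.
           The function name is hypothetical and the code only illustrates
           the arithmetic described above; it is not taken from pullnews.

```python
def initial_water_mark(num: str, current: int) -> int:
    """Return the high water mark implied by "-w num" for one group."""
    # The argument is inspected as text so that "-0" stays distinct from "0".
    if num.startswith("-"):
        return current + int(num)  # int("-0") is 0, so "-0" yields current
    return int(num)


if __name__ == "__main__":
    current = 783  # most recent article number on the server (example value)
    for arg in ("0", "-0", "-10", "500"):
        print(arg, initial_water_mark(arg, current))
```

           With a current article number of 783, "-10" gives a water mark of
           773, so the last ten articles are fetched.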

       -x  If the -x flag is used, an Xref header field is added to any
           article that lacks one.  It can be useful for instance if articles
           are fed to a news server which has xrefslave set in inn.conf.

       -z article-pause
           Sleep article-pause seconds between articles.  The default is 0.

       -Z group-pause
           Sleep group-pause seconds between groups.  The default is 0.

CONFIG FILE

       The config file for pullnews is divided into blocks, one block for each
       remote server to connect to.  A block begins with the host line (which
       must have no leading whitespace) and contains just the hostname of the
       remote server with optional port and TLS mode (with the same semantics
       as the -s flag), optionally followed by authentication details
       (username and password for that server).  Note that authentication
       details can also be provided for the downstream server (a host line for
       "localhost" or the hostname specified with the -s flag could be added
       for it in the configuration file, with no newsgroup to fetch).

       Following the host line should be one or more newsgroup lines which
       start with whitespace followed by the name of a newsgroup to retrieve.
       Only one newsgroup should be listed on each line.

       pullnews will update the config file to include the time the group was
       last checked and the highest numbered article successfully retrieved
       and transferred to the destination server.  It uses this data to avoid
       doing duplicate work the next time it runs.

       The full syntax is:

           <host>[:<port>][_<tlsmode>] [<username> <password>]
               <group> [<time> <high>]
               <group> [<time> <high>]

       where the <host> line must not have leading whitespace and the <group>
       lines must.

       A typical configuration file would be:

           # Format: group date high
           data.pa.vix.com
               rec.bicycles.racing 908086612 783
               rec.humor.funny 908086613 18
               comp.programming.threads
           nnrp.vix.com pull sekret
               comp.std.lisp
           news.server.com:563_TLS joe password
               news.software.nntp

       Note that an earlier run of pullnews has filled in details about the
       last article downloads from the two rec.* groups.  The two comp.*
       groups and the news.* group were just added by the user and have not
       yet been checked.

       The nnrp.vix.com server requires authentication, and pullnews will use
       the username "pull" and the password "sekret" (without any encryption
       layer).

       The connection to news.server.com will be encrypted with implicit TLS
       on port 563.  Joe's password won't be sent in plaintext.

FILES

       pathbin/pullnews
           The Perl script itself used to pull news from upstream servers and
           feed it to another news server.

       pathdb/pullnews.marks or ~/pullnews.marks
           The default config file.  It is stored in pullnews.marks in pathdb
           when pullnews is run as the news user, or otherwise in the running
           user's home directory.

HISTORY

       pullnews was written by James Brister for INN.  The documentation was
       rewritten in POD by Russ Allbery <eagle@eyrie.org>.

       Geraint A. Edwards greatly improved pullnews, adding no fewer than
       16 new recognized flags, fixing some bugs, and integrating the
       backupfeed contrib script by Kai Henningsen, which added 6 more
       flags.

SEE ALSO

       incoming.conf(5), rnews(1).

INN 2.7.1                         2023-04-16                       PULLNEWS(1)