Mail::SpamAssassin::Conf(3pm)

Mail::SpamAssassin::Conf(3)  User Contributed Perl Documentation  Mail::SpamAssassin::Conf(3)

NAME

       Mail::SpamAssassin::Conf - SpamAssassin configuration file

SYNOPSIS

         # a comment

         rewrite_header Subject          *****SPAM*****

         full PARA_A_2_C_OF_1618         /Paragraph .a.{0,10}2.{0,10}C. of S. 1618/i
         describe PARA_A_2_C_OF_1618     Claims compliance with senate bill 1618

         header FROM_HAS_MIXED_NUMS      From =~ /\d+[a-z]+\d+\S*@/i
         describe FROM_HAS_MIXED_NUMS    From: contains numbers mixed in with letters

         score A_HREF_TO_REMOVE          2.0

         lang es describe FROM_FORGED_HOTMAIL Forzado From: simula ser de hotmail.com

         lang pt_BR report O programa detetor de Spam ZOE [...]

DESCRIPTION

       SpamAssassin is configured using traditional UNIX-style configuration
       files, loaded from the "/usr/share/spamassassin" and
       "/etc/mail/spamassassin" directories.

       The following web page lists the most important configuration settings
       used to configure SpamAssassin; novices are encouraged to read it
       first:

         http://wiki.apache.org/spamassassin/ImportantInitialConfigItems

FILE FORMAT

       The "#" character starts a comment, which continues until the end of
       the line.  NOTE: if the "#" character is to be used as part of a rule
       or configuration option, it must be escaped with a backslash, i.e.
       "\#".
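
       The comment rule above can be illustrated with a minimal Python
       sketch.  This is not SpamAssassin's own parser; the function name
       strip_comment is hypothetical, and the sketch assumes that an
       escaped "\#" is turned back into a literal "#" once the comment
       has been removed.

```python
import re


def strip_comment(line: str) -> str:
    """Drop an unescaped '#' comment, then unescape any '\\#' sequences."""
    # Remove everything from the first '#' that is not preceded by a backslash.
    without_comment = re.sub(r"(?<!\\)#.*$", "", line)
    # Turn the escaped form back into a literal '#' and trim trailing spaces.
    return without_comment.replace(r"\#", "#").rstrip()
```

       For instance, a line consisting only of "# a comment" reduces to an
       empty string, while a rule containing "\#" keeps a literal "#".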

       Whitespace in the files is not significant, but please note that
       starting a line with whitespace is deprecated, as its use is reserved
       for multi-line rule definitions at some point in the future.

       Currently, each rule or configuration setting must fit on one line;
       multi-line settings are not supported yet.

       File and directory paths can use "~" to refer to the user's home
       directory, but no other shell-style path extensions such as globbing
       or "~user/" are supported.

       Where appropriate below, default values are listed in parentheses.

       Test names ("SYMBOLIC_TEST_NAME") can only contain alphanumerics and
       underscores, cannot start with a digit, and must be shorter than 128
       characters.
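
       The test-name constraints can be written as a short check.  The
       sketch below is illustrative only: is_valid_test_name is a
       hypothetical helper, and it assumes "alphanumerics" means ASCII
       letters and digits.

```python
import re

# Letters, digits and underscores only; the first character is not a digit.
TEST_NAME_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_test_name(name: str) -> bool:
    """Return True if name satisfies the three constraints on test names."""
    return len(name) < 128 and TEST_NAME_RE.fullmatch(name) is not None
```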
57

USER PREFERENCES

59       The following options can be used in both site-wide ("local.cf") and
60       user-specific ("user_prefs") configuration files to customize how
61       SpamAssassin handles incoming email messages.
62
63   SCORING OPTIONS
64       required_score n.nn (default: 5)
65           Set the score required before a mail is considered spam.  "n.nn"
66           can be an integer or a real number.  5.0 is the default setting,
67           and is quite aggressive; it would be suitable for a single-user
68           setup, but if you're an ISP installing SpamAssassin, you should
69           probably set the default to be more conservative, like 8.0 or 10.0.
70           It is not recommended to automatically delete or discard messages
71           marked as spam, as your users will complain, but if you choose to
72           do so, only delete messages with an exceptionally high score such
73           as 15.0 or higher. This option was previously known as
74           "required_hits" and that name is still accepted, but is deprecated.
75
76       score SYMBOLIC_TEST_NAME n.nn [ n.nn n.nn n.nn ]
77           Assign scores (the number of points for a hit) to a given test.
78           Scores can be positive or negative real numbers or integers.
79           "SYMBOLIC_TEST_NAME" is the symbolic name used by SpamAssassin for
80           that test; for example, 'FROM_ENDS_IN_NUMS'.
81
82           If only one valid score is listed, then that score is always used
83           for a test.
84
85           If four valid scores are listed, then the score that is used
86           depends on how SpamAssassin is being used. The first score is used
87           when both Bayes and network tests are disabled (score set 0). The
88           second score is used when Bayes is disabled, but network tests are
89           enabled (score set 1). The third score is used when Bayes is
90           enabled and network tests are disabled (score set 2). The fourth
91           score is used when Bayes is enabled and network tests are enabled
92           (score set 3).
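
           The mapping from the Bayes and network-test settings to a score
           set can be sketched as follows.  The function names are
           hypothetical and this is not SpamAssassin code; it only
           restates the four cases described above.

```python
def score_set_index(bayes_enabled: bool, network_enabled: bool) -> int:
    """Return the score set number (0 to 3) for the given settings."""
    # Bayes contributes 2, network tests contribute 1.
    return (2 if bayes_enabled else 0) + (1 if network_enabled else 0)


def effective_score(scores, bayes_enabled, network_enabled):
    """Pick the score that applies, given one or four configured scores."""
    if len(scores) == 1:
        return scores[0]  # a single score is always used
    if len(scores) != 4:
        raise ValueError("expected one or four scores")
    return scores[score_set_index(bayes_enabled, network_enabled)]
```

           With scores "1.0 1.5 2.0 2.5", a setup that has Bayes enabled
           and network tests disabled would use 2.0 (score set 2).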
93
94           Setting a rule's score to 0 will disable that rule from running.
95
           If any of the score values are surrounded by parentheses '()',
           then all of the scores in the line are considered to be relative
           to the already set score.  For example, '(3)' means increase the
           score for this rule by 3 points in all score sets.
           '(3) (0) (3) (0)' means increase the score for this rule by 3 in
           score sets 0 and 2 only.
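
           A minimal sketch of how relative scores combine with the
           scores already set.  The helper apply_score_line is
           hypothetical; it assumes the current scores are held as a list
           of four numbers, one per score set, and that a single value
           applies to all four sets.

```python
def apply_score_line(current, values):
    """Apply the value tokens of one 'score' line to four current scores.

    current: list of four floats, indexed by score set.
    values:  one or four tokens such as "2.0" or "(3)".
    """
    # One parenthesized value makes the whole line relative.
    relative = any(v.startswith("(") and v.endswith(")") for v in values)
    numbers = [float(v.strip("()")) for v in values]
    if len(numbers) == 1:
        numbers = numbers * 4  # a single value applies to every score set
    if len(numbers) != 4:
        raise ValueError("expected one or four score values")
    if relative:
        return [c + n for c, n in zip(current, numbers)]
    return numbers
```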
101
102           If no score is given for a test by the end of the configuration, a
103           default score is assigned: a score of 1.0 is used for all tests,
104           except those whose names begin with 'T_' (this is used to indicate
105           a rule in testing) which receive 0.01.
106
107           Note that test names which begin with '__' are indirect rules used
108           to compose meta-match rules and can also act as prerequisites to
109           other rules.  They are not scored or listed in the 'tests hit'
110           reports, but assigning a score of 0 to an indirect rule will
111           disable it from running.
112
113   WHITELIST AND BLACKLIST OPTIONS
114       whitelist_from user@example.com
115           Used to whitelist sender addresses which send mail that is often
116           tagged (incorrectly) as spam.
117
118           Use of this setting is not recommended, since it blindly trusts the
119           message, which is routinely and easily forged by spammers and phish
120           senders. The recommended solution is to instead use
121           "whitelist_auth" or other authenticated whitelisting methods, or
122           "whitelist_from_rcvd".
123
124           Whitelist and blacklist addresses are now file-glob-style patterns,
125           so "friend@somewhere.com", "*@isp.com", or "*.domain.net" will all
126           work.  Specifically, "*" and "?" are allowed, but all other
127           metacharacters are not. Regular expressions are not used for
128           security reasons.  Matching is case-insensitive.
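
           The glob-style matching described above can be approximated in
           a few lines.  This is a sketch, not the SpamAssassin
           implementation: the function names are hypothetical, and it
           assumes "*" matches any run of characters, "?" matches a single
           character, and the pattern must match the whole address.

```python
import re


def glob_to_regex(pattern: str):
    """Translate a pattern where only '*' and '?' are special."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))  # every other character is literal
    return re.compile("".join(parts), re.IGNORECASE)


def address_matches(pattern: str, address: str) -> bool:
    """Case-insensitive match of the whole address against the pattern."""
    return glob_to_regex(pattern).fullmatch(address) is not None
```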
129
130           Multiple addresses per line, separated by spaces, is OK.  Multiple
131           "whitelist_from" lines are also OK.
132
133           The headers checked for whitelist addresses are as follows: if
134           "Resent-From" is set, use that; otherwise check all addresses taken
135           from the following set of headers:
136
137                   Envelope-Sender
138                   Resent-Sender
139                   X-Envelope-From
140                   From
141
142           In addition, the "envelope sender" data, taken from the SMTP
143           envelope data where this is available, is looked up.  See
144           "envelope_sender_header".
145
146           e.g.
147
148             whitelist_from joe@example.com fred@example.com
149             whitelist_from *@example.com
150
151       unwhitelist_from user@example.com
152           Used to override a default whitelist_from entry, so for example a
153           distribution whitelist_from can be overridden in a local.cf file,
154           or an individual user can override a whitelist_from entry in their
155           own "user_prefs" file.  The specified email address has to match
156           exactly (although case-insensitively) the address previously used
157           in a whitelist_from line, which implies that a wildcard only
158           matches literally the same wildcard (not 'any' address).
159
160           e.g.
161
162             unwhitelist_from joe@example.com fred@example.com
163             unwhitelist_from *@example.com
164
165       whitelist_from_rcvd addr@lists.sourceforge.net sourceforge.net
166           Works similarly to whitelist_from, except that in addition to
167           matching a sender address, a relay's rDNS name or its IP address
168           must match too for the whitelisting rule to fire. The first
169           parameter is a sender's e-mail address to whitelist, and the second
170           is a string to match the relay's rDNS, or its IP address. Matching
171           is case-insensitive.
172
173           This second parameter is matched against a TCP-info information
174           field as provided in a FROM clause of a trace information (i.e. in
175           a Received header field, see RFC 5321). Only the Received header
176           fields inserted by trusted hosts are considered. This parameter can
177           either be a full hostname, or a domain component of that hostname,
178           or an IP address (optionally followed by a slash and a prefix
179           length) in square brackets. The address prefix (mask) length with a
180           slash may stand within brackets along with an address, or may
181           follow the bracketed address. Reverse DNS lookup is done by an MTA,
182           not by SpamAssassin.
183
184           For backward compatibility as an alternative to a CIDR notation, an
185           IPv4 address in brackets may be truncated on classful boundaries to
186           cover whole subnets, e.g. "[10.1.2.3]", "[10.1.2]", "[10.1]",
187           "[10]".
188
189           In other words, if the host that connected to your MX had an IP
190           address 192.0.2.123 that mapped to 'sendinghost.example.org', you
191           should specify "sendinghost.example.org", or "example.org", or
192           "[192.0.2.123]", or "[192.0.2.0/24]", or "[192.0.2]" here.
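
           The accepted forms of the second parameter can be illustrated
           with Python's standard ipaddress module.  The function
           relay_matches is a hypothetical helper, not part of
           SpamAssassin, and the sketch assumes that a host name parameter
           matches either the full rDNS name or a trailing domain
           component of it.

```python
import ipaddress


def relay_matches(spec: str, rdns: str, ip: str) -> bool:
    """Check a relay's rDNS name and IP address against one parameter."""
    spec = spec.strip()
    if spec.startswith("["):
        inside, _, after = spec[1:].partition("]")
        if "/" in inside:                 # [addr/len]
            addr, _, plen = inside.partition("/")
        elif after.startswith("/"):       # [addr]/len
            addr, plen = inside, after[1:]
        else:                             # [addr], possibly truncated
            addr, plen = inside, None
        if ":" not in addr:
            octets = addr.split(".")
            if len(octets) < 4:
                # Truncated IPv4 such as [10.1]: pad with zeros and use
                # 8 bits of prefix per octet that was given.
                if plen is None:
                    plen = str(8 * len(octets))
                addr = ".".join(octets + ["0"] * (4 - len(octets)))
        network = ipaddress.ip_network(
            addr if plen is None else f"{addr}/{plen}", strict=False)
        client = ipaddress.ip_address(ip)
        return client.version == network.version and client in network
    # Host name form: exact match or a domain component of the rDNS name.
    host = rdns.lower().rstrip(".")
    domain = spec.lower()
    return host == domain or host.endswith("." + domain)
```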
193
194           Note that this requires that "internal_networks" be correct.  For
195           simple cases, it will be, but for a complex network you may get
196           better results by setting that parameter.
197
198           It also requires that your mail exchangers be configured to perform
199           DNS reverse lookups on the connecting host's IP address, and to
200           record the result in the generated Received header field according
201           to RFC 5321.
202
203           e.g.
204
205             whitelist_from_rcvd joe@example.com  example.com
206             whitelist_from_rcvd *@*              mail.example.org
207             whitelist_from_rcvd *@axkit.org      [192.0.2.123]
208             whitelist_from_rcvd *@axkit.org      [192.0.2.0/24]
209             whitelist_from_rcvd *@axkit.org      [192.0.2.0]/24
210             whitelist_from_rcvd *@axkit.org      [2001:db8:1234::/48]
211             whitelist_from_rcvd *@axkit.org      [2001:db8:1234::]/48
212
213       def_whitelist_from_rcvd addr@lists.sourceforge.net sourceforge.net
214           Same as "whitelist_from_rcvd", but used for the default whitelist
215           entries in the SpamAssassin distribution.  The whitelist score is
216           lower, because these are often targets for spammer spoofing.
217
218       whitelist_allows_relays user@example.com
           Specify addresses which are in "whitelist_from_rcvd" that sometimes
           send through a mail relay other than the listed ones. By default,
           mail with a From address that is in "whitelist_from_rcvd" that does
           not match the relay will trigger a forgery rule. Including the
           address in "whitelist_allows_relays" prevents that.
224
225           Whitelist and blacklist addresses are now file-glob-style patterns,
226           so "friend@somewhere.com", "*@isp.com", or "*.domain.net" will all
227           work.  Specifically, "*" and "?" are allowed, but all other
228           metacharacters are not. Regular expressions are not used for
229           security reasons.  Matching is case-insensitive.
230
231           Multiple addresses per line, separated by spaces, is OK.  Multiple
232           "whitelist_allows_relays" lines are also OK.
233
234           The specified email address does not have to match exactly the
235           address previously used in a whitelist_from_rcvd line as it is
236           compared to the address in the header.
237
238           e.g.
239
240             whitelist_allows_relays joe@example.com fred@example.com
241             whitelist_allows_relays *@example.com
242
243       unwhitelist_from_rcvd user@example.com
244           Used to override a default whitelist_from_rcvd entry, so for
245           example a distribution whitelist_from_rcvd can be overridden in a
246           local.cf file, or an individual user can override a
247           whitelist_from_rcvd entry in their own "user_prefs" file.
248
249           The specified email address has to match exactly the address
250           previously used in a whitelist_from_rcvd line.
251
252           e.g.
253
254             unwhitelist_from_rcvd joe@example.com fred@example.com
255             unwhitelist_from_rcvd *@axkit.org
256
257       blacklist_from user@example.com
258           Used to specify addresses which send mail that is often tagged
259           (incorrectly) as non-spam, but which the user doesn't want.  Same
260           format as "whitelist_from".
261
262       unblacklist_from user@example.com
263           Used to override a default blacklist_from entry, so for example a
264           distribution blacklist_from can be overridden in a local.cf file,
265           or an individual user can override a blacklist_from entry in their
266           own "user_prefs" file. The specified email address has to match
267           exactly the address previously used in a blacklist_from line.
268
269           e.g.
270
271             unblacklist_from joe@example.com fred@example.com
272             unblacklist_from *@spammer.com
273
274       whitelist_to user@example.com
275           If the given address appears as a recipient in the message headers
276           (Resent-To, To, Cc, obvious envelope recipient, etc.) the mail will
277           be whitelisted.  Useful if you're deploying SpamAssassin system-
278           wide, and don't want some users to have their mail filtered.  Same
279           format as "whitelist_from".
280
281           There are three levels of To-whitelisting, "whitelist_to",
282           "more_spam_to" and "all_spam_to".  Users in the first level may
283           still get some spammish mails blocked, but users in "all_spam_to"
284           should never get mail blocked.
285
286           The headers checked for whitelist addresses are as follows: if
287           "Resent-To" or "Resent-Cc" are set, use those; otherwise check all
288           addresses taken from the following set of headers:
289
290                   To
291                   Cc
292                   Apparently-To
293                   Delivered-To
294                   Envelope-Recipients
295                   Apparently-Resent-To
296                   X-Envelope-To
297                   Envelope-To
298                   X-Delivered-To
299                   X-Original-To
300                   X-Rcpt-To
301                   X-Real-To
302
303       more_spam_to user@example.com
304           See above.
305
306       all_spam_to user@example.com
307           See above.
308
309       blacklist_to user@example.com
310           If the given address appears as a recipient in the message headers
311           (Resent-To, To, Cc, obvious envelope recipient, etc.) the mail will
312           be blacklisted.  Same format as "blacklist_from".
313
314       whitelist_auth user@example.com
315           Used to specify addresses which send mail that is often tagged
316           (incorrectly) as spam.  This is different from "whitelist_from" and
317           "whitelist_from_rcvd" in that it first verifies that the message
318           was sent by an authorized sender for the address, before
319           whitelisting.
320
321           Authorization is performed using one of the installed sender-
322           authorization schemes: SPF (using
323           "Mail::SpamAssassin::Plugin::SPF"), or DKIM (using
324           "Mail::SpamAssassin::Plugin::DKIM").  Note that those plugins must
325           be active, and working, for this to operate.
326
327           Using "whitelist_auth" is roughly equivalent to specifying
328           duplicate "whitelist_from_spf", "whitelist_from_dk", and
329           "whitelist_from_dkim" lines for each of the addresses specified.
330
331           e.g.
332
333             whitelist_auth joe@example.com fred@example.com
334             whitelist_auth *@example.com
335
336       def_whitelist_auth user@example.com
337           Same as "whitelist_auth", but used for the default whitelist
338           entries in the SpamAssassin distribution.  The whitelist score is
339           lower, because these are often targets for spammer spoofing.
340
341       unwhitelist_auth user@example.com
342           Used to override a "whitelist_auth" entry. The specified email
343           address has to match exactly the address previously used in a
344           "whitelist_auth" line.
345
346           e.g.
347
348             unwhitelist_auth joe@example.com fred@example.com
349             unwhitelist_auth *@example.com
350
351       enlist_uri_host (listname) host ...
352           Adds one or more host names or domain names to a named list of URI
353           domains.  The named list can then be consulted through a
354           check_uri_host_listed() eval rule implemented by the WLBLEval
           plugin, which takes the list name as an argument. The parentheses
           around a list name are literal and are required syntax.
357
358           Host names may optionally be prefixed by an exclamation mark '!',
359           which produces false as a result if this entry matches. This makes
360           it easier to exclude some subdomains when their superdomain is
361           listed, for example:
362
363             enlist_uri_host (MYLIST) !sub1.example.com !sub2.example.com example.com
364
365           No wildcards are supported, but subdomains do match implicitly.
366           Lists are independent. Search for each named list starts by looking
367           up the full hostname first, then leading fields are progressively
368           stripped off (e.g.: sub.example.com, example.com, com) until a
369           match is found or we run out of fields. The first matching entry
370           (the most specific) determines if a lookup yielded a true (no '!'
371           prefix) or a false (with a '!' prefix) result.
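
           The lookup order described above can be sketched in Python.
           The names build_uri_host_list and host_listed are hypothetical,
           and the sketch assumes that a host with no matching entry
           counts as not listed.

```python
def build_uri_host_list(entries: str) -> dict:
    """Parse 'host !host ...' into a table mapping each host to a result."""
    table = {}
    for entry in entries.split():
        negated = entry.startswith("!")
        table[entry.lstrip("!").lower()] = not negated
    return table


def host_listed(table: dict, host: str) -> bool:
    """Look up the full host name first, then strip leading fields."""
    labels = host.lower().split(".")
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in table:
            # The first (most specific) matching entry decides the result.
            return table[candidate]
    return False  # ran out of fields without a match
```

           With the MYLIST example above, "www.example.com" yields true,
           while "a.sub1.example.com" yields false because the more
           specific "!sub1.example.com" entry matches first.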
372
           If a URL found in a message contains an IP address in place of a
374           host name, the given list must specify the exact same IP address
375           (instead of a host name) in order to match.
376
377           Use the delist_uri_host directive to neutralize previous
378           enlist_uri_host settings.
379
           Lists named 'BLACK' and 'WHITE' have shorthand directives,
           blacklist_uri_host and whitelist_uri_host, and corresponding
           default rules, but the names 'BLACK' and 'WHITE' are otherwise not
           special or reserved.
384
385       delist_uri_host [ (listname) ] host ...
           Removes one or more specified host names from a named list of URI
           domains.  Removing an unlisted name is ignored (it is not an
           error).  The list name is optional: if specified, only the named
           list is affected; otherwise hosts are removed from all URI host
           lists created so far. The parentheses around a list name are
           required syntax.

           Note that directives in configuration files are processed in
           sequence: delist_uri_host only applies to previously listed
           entries and has no effect on entries enlisted by directives that
           are yet to be processed.

           For convenience (similarity to the enlist_uri_host directive),
           host names may be prefixed by an exclamation mark, which is
           stripped off from each name and has no meaning here.
401
402       enlist_addrlist (listname) user@example.com
403           Adds one or more addresses to a named list of addresses.  The named
404           list can then be consulted through a check_from_in_list() or a
405           check_to_in_list() eval rule implemented by the WLBLEval plugin,
           which takes the list name as an argument. The parentheses around a
           list name are literal and are required syntax.
408
409           Listed addresses are file-glob-style patterns, so
410           "friend@somewhere.com", "*@isp.com", or "*.domain.net" will all
411           work.  Specifically, "*" and "?" are allowed, but all other
412           metacharacters are not. Regular expressions are not used for
413           security reasons.  Matching is case-insensitive.
414
415           Multiple addresses per line, separated by spaces, is OK.  Multiple
416           "enlist_addrlist" lines are also OK.
417
           Enlisting an address to the list named blacklist_to is equivalent
           to using the directive blacklist_to.

           Enlisting an address to the list named blacklist_from is
           equivalent to using the directive blacklist_from.

           Enlisting an address to the list named whitelist_to is equivalent
           to using the directive whitelist_to.

           Enlisting an address to the list named whitelist_from is
           equivalent to using the directive whitelist_from.
429
430           e.g.
431
432             enlist_addrlist (PAYPAL_ADDRESS) service@paypal.com
433             enlist_addrlist (PAYPAL_ADDRESS) *@paypal.co.uk
434
435       blacklist_uri_host host-or-domain ...
436           Is a shorthand for a directive:  enlist_uri_host (BLACK) host ...
437
438           Please see directives enlist_uri_host and delist_uri_host for
439           details.
440
441       whitelist_uri_host host-or-domain ...
           Is a shorthand for a directive:  enlist_uri_host (WHITE) host ...
443
444           Please see directives enlist_uri_host and delist_uri_host for
445           details.
446
447   BASIC MESSAGE TAGGING OPTIONS
448       rewrite_header { subject | from | to } STRING
449           By default, suspected spam messages will not have the "Subject",
450           "From" or "To" lines tagged to indicate spam. By setting this
451           option, the header will be tagged with "STRING" to indicate that a
452           message is spam. For the From or To headers, this will take the
453           form of an RFC 2822 comment following the address in parentheses.
454           For the Subject header, this will be prepended to the original
455           subject. Note that you should only use the _REQD_ and _SCORE_ tags
456           when rewriting the Subject header if "report_safe" is 0. Otherwise,
457           you may not be able to remove the SpamAssassin markup via the
458           normal methods.  More information about tags is explained below in
459           the TEMPLATE TAGS section.
460
461           Parentheses are not permitted in STRING if rewriting the From or To
462           headers.  (They will be converted to square brackets.)
463
464           If "rewrite_header subject" is used, but the message being
465           rewritten does not already contain a "Subject" header, one will be
466           created.
467
468           A null value for "STRING" will remove any existing rewrite for the
469           specified header.
470
471       subjprefix
           Add a prefix to the email's Subject header if a rule matches.  To
           use this option, the "rewrite_header Subject" config option must
           be enabled as well.

           The check "if can(Mail::SpamAssassin::Conf::feature_subjprefix)"
           should be used to silence warnings in older SpamAssassin
           versions.

           On some setups, an "add_header all Subjprefix _SUBJPREFIX_"
           configuration line may be needed to use this feature.
482
483       add_header { spam | ham | all } header_name string
484           Customized headers can be added to the specified type of messages
485           (spam, ham, or "all" to add to either).  All headers begin with
486           "X-Spam-" (so a "header_name" Foo will generate a header called
487           X-Spam-Foo).  header_name is restricted to the character set
488           [A-Za-z0-9_-].
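
           As a small illustration of the naming rule, the sketch below
           validates a "header_name" and builds the resulting header
           field name.  The helper full_header_name is hypothetical and
           only restates the two constraints given above.

```python
import re

# header_name is restricted to the character set [A-Za-z0-9_-].
HEADER_NAME_RE = re.compile(r"[A-Za-z0-9_-]+")


def full_header_name(header_name: str) -> str:
    """Return the header field name that add_header would generate."""
    if HEADER_NAME_RE.fullmatch(header_name) is None:
        raise ValueError("invalid header_name: %r" % header_name)
    return "X-Spam-" + header_name
```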
489
490           The order of "add_header" configuration options is preserved,
491           inserted headers will follow this order of declarations. When
492           combining "add_header" with "clear_headers" and "remove_header",
493           keep in mind that "add_header" appends a new header to the current
494           list, after first removing any existing header fields of the same
495           name. Note also that "add_header", "clear_headers" and
496           "remove_header" may appear in multiple .cf files, which are
497           interpreted in alphabetic order.
498
           "string" can contain tags as explained below in the TEMPLATE TAGS
           section.  You can also use "\n" and "\t" in the header to add
           newlines and tabs as desired.  A backslash has to be written as
           \\; any other escaped characters will be silently removed.

           All headers will be folded if fold_headers is set to 1. Note:
           manually adding newlines via "\n" disables any further automatic
           wrapping (i.e. long header lines are possible). The lines will
           still be properly folded (marked as continuing) though.
508
509           You can customize existing headers with add_header (only the
510           specified subset of messages will be changed).
511
512           See also "clear_headers" and "remove_header" for removing headers.
513
514           Here are some examples (these are the defaults, note that Checker-
515           Version can not be changed or removed):
516
517             add_header spam Flag _YESNOCAPS_
518             add_header all Status _YESNO_, score=_SCORE_ required=_REQD_ tests=_TESTS_ autolearn=_AUTOLEARN_ version=_VERSION_
519             add_header all Level _STARS(*)_
520             add_header all Checker-Version SpamAssassin _VERSION_ (_SUBVERSION_) on _HOSTNAME_
521
522       remove_header { spam | ham | all } header_name
523           Headers can be removed from the specified type of messages (spam,
524           ham, or "all" to remove from either).  All headers begin with
525           "X-Spam-" (so "header_name" will be appended to "X-Spam-").
526
527           See also "clear_headers" for removing all the headers at once.
528
529           Note that X-Spam-Checker-Version is not removable because the
530           version information is needed by mail administrators and developers
531           to debug problems.  Without at least one header, it might not even
532           be possible to determine that SpamAssassin is running.
533
534       clear_headers
535           Clear the list of headers to be added to messages.  You may use
536           this before any add_header options to prevent the default headers
537           from being added to the message.
538
539           "add_header", "clear_headers" and "remove_header" may appear in
540           multiple .cf files, which are interpreted in alphabetic order, so
541           "clear_headers" in a later file will remove all added headers from
542           previously interpreted configuration files, which may or may not be
543           desired.
544
545           Note that X-Spam-Checker-Version is not removable because the
546           version information is needed by mail administrators and developers
547           to debug problems.  Without at least one header, it might not even
548           be possible to determine that SpamAssassin is running.
549
550       report_safe ( 0 | 1 | 2 )     (default: 1)
           If this option is set to 1 and an incoming message is tagged as
           spam, then instead of modifying the original message, SpamAssassin
           will create a new report message and attach the original message
           as a message/rfc822 MIME part (ensuring the original message is
           completely preserved, not easily opened, and easier to recover).
556
557           If this option is set to 2, then original messages will be attached
558           with a content type of text/plain instead of message/rfc822.  This
559           setting may be required for safety reasons on certain broken mail
560           clients that automatically load attachments without any action by
561           the user.  This setting may also make it somewhat more difficult to
562           extract or view the original message.
563
564           If this option is set to 0, incoming spam is only modified by
565           adding some "X-Spam-" headers and no changes will be made to the
566           body.  In addition, a header named X-Spam-Report will be added to
567           spam.  You can use the remove_header option to remove that header
568           after setting report_safe to 0.
569
570           See report_safe_copy_headers if you want to copy headers from the
571           original mail into tagged messages.
572
573       report_wrap_width (default: 70)
574           This option sets the wrap width for description lines in the
575           X-Spam-Report header, not accounting for tab width.
576
577   LANGUAGE OPTIONS
578       ok_locales xx [ yy zz ... ]        (default: all)
579           This option is used to specify which locales are considered OK for
580           incoming mail.  Mail using the character sets that are allowed by
581           this option will not be marked as possibly being spam in a foreign
582           language.
583
584           If you receive lots of spam in foreign languages, and never get any
585           non-spam in these languages, this may help.  Note that all
586           ISO-8859-* character sets, and Windows code page character sets,
587           are always permitted by default.
588
589           Set this to "all" to allow all character sets.  This is the
590           default.
591
592           The rules "CHARSET_FARAWAY", "CHARSET_FARAWAY_BODY", and
593           "CHARSET_FARAWAY_HEADERS" are triggered based on how this is set.
594
595           Examples:
596
597             ok_locales all         (allow all locales)
598             ok_locales en          (only allow English)
599             ok_locales en ja zh    (allow English, Japanese, and Chinese)
600
601           Note: if there are multiple ok_locales lines, only the last one is
602           used.
603
604           Select the locales to allow from the list below:
605
606           en   - Western character sets in general
607           ja   - Japanese character sets
608           ko   - Korean character sets
609           ru   - Cyrillic character sets
610           th   - Thai character sets
611           zh   - Chinese (both simplified and traditional) character sets
612       normalize_charset ( 0 | 1)        (default: 0)
613           Whether to decode non- UTF-8 and non-ASCII textual parts and recode
614           them to UTF-8 before the text is given over to rules processing.
615           The character set used for attempted decoding is primarily based on
616           a declared character set in a Content-Type header, but if the
617           decoding attempt fails a module Encode::Detect::Detector is
618           consulted (if available) to provide a guess based on the actual
619           text, and decoding is re-attempted. Even if the option is enabled
620           no unnecessary decoding and re-encoding work is done when possible
621           (like with an all-ASCII text with a US-ASCII or extended ASCII
622           character set declaration, e.g. UTF-8 or ISO-8859-nn or Windows-
623           nnnn).
624
625           Unicode support in old versions of perl or in a core module Encode
626           is likely to be buggy in places, so if the normalize_charset
627           function is enabled it is advised to stick to more recent versions
628           of perl (preferably 5.12 or later). The module
629           Encode::Detect::Detector is optional; it is used only when
630           necessary and only if it is available.
631
632       body_part_scan_size               (default: 50000)
633           Per MIME-part scan size limit in bytes for "body" type rules.  The
634           decoded/stripped MIME part is truncated to approximately this size.
635           This helps scan large messages safely, so it is not necessary to
636           skip them completely.  Set to 0 to disable the limit.
637
638       rawbody_part_scan_size               (default: 500000)
639           Like body_part_scan_size, for "rawbody" type rules.
640
641   NETWORK TEST OPTIONS
642       trusted_networks IPaddress[/masklen] ...   (default: none)
643           What networks or hosts are 'trusted' in your setup.  Trusted in
644           this case means that relay hosts on these networks are considered
645           to not be potentially operated by spammers, open relays, or open
646           proxies.  A trusted host could conceivably relay spam, but will not
647           originate it, and will not forge header data. DNS blacklist checks
648           will never query for hosts on these networks.
649
650           See "http://wiki.apache.org/spamassassin/TrustPath" for more
651           information.
652
653           MXes for your domain(s) and internal relays should also be
654           specified using the "internal_networks" setting. When there are
655           'trusted' hosts that are not MXes or internal relays for your
656           domain(s) they should only be specified in "trusted_networks".
657
658           The "IPaddress" can be an IPv4 address (in a dot-quad form), or an
659           IPv6 address optionally enclosed in square brackets. Scoped link-
660           local IPv6 addresses are syntactically recognized but the interface
661           scope is currently ignored (e.g. [fe80::1234%eth0] ) and should be
662           avoided.
663
664           If a "/masklen" is specified, it is considered a CIDR-style
665           'netmask' length, specified in bits.  If it is not specified, but
666           less than 4 octets of an IPv4 address are specified with a trailing
667           dot, an implied netmask length covers all addresses in remaining
668           octets (i.e. implied masklen is /8 or /16 or /24).  If masklen is
669           not specified, and there is no trailing dot, then just a single IP
670           address specified is used, as if the masklen were "/32" with an
671           IPv4 address, or "/128" in case of an IPv6 address.
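
              The masklen rules above can be sketched in Python as follows.
              This is only an illustration, not SpamAssassin's own parser: the
              helper name parse_network is hypothetical, only IPv4 entries are
              handled, and "!" exclusions are ignored.

```python
import ipaddress


def parse_network(spec):
    """Turn an IPv4 entry into a network, applying the implied-mask rules.

    Hypothetical helper for illustration; not part of SpamAssassin.
    """
    if "/" in spec:
        # Explicit CIDR-style mask length, e.g. "10.0.1/24".
        addr, masklen_text = spec.split("/", 1)
        octets = addr.rstrip(".").split(".")
        masklen = int(masklen_text)
    elif spec.endswith("."):
        # Trailing dot: the mask covers only the octets that were given.
        octets = spec.rstrip(".").split(".")
        masklen = 8 * len(octets)
    else:
        # No mask and no trailing dot: a single host.
        octets = spec.split(".")
        masklen = 32
    # Pad missing octets with zeros so the address is a full dot-quad.
    octets = octets + ["0"] * (4 - len(octets))
    return ipaddress.ip_network(".".join(octets) + "/" + str(masklen), strict=False)


if __name__ == "__main__":
    for entry in ("192.168.0.0/16", "192.168.", "212.17.35.15", "10.0.1/24"):
        print(entry, "->", parse_network(entry))
```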
672
673           If a network or host address is prefaced by a "!" the matching
674           network or host will be excluded from the list even if a less
675           specific (shorter netmask length) subnet is later specified in the
676           list. This allows a subset of a wider network to be exempt. In case
677           of specifying overlapping subnets, specify more specific subnets
678           first (tighter matching, i.e. with a longer netmask length),
679           followed by less specific (shorter netmask length) subnets to get
680           predictable results regardless of the search algorithm used - when
681           Net::Patricia module is installed the search finds the tightest
682           matching entry in the list, while a sequential search as used in
683           absence of the module Net::Patricia will find the first matching
684           entry in the list.
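
              Why the ordering advice matters can be shown with a small Python
              sketch.  It models the two search strategies in the abstract
              (first match versus tightest match); it is not the Net::Patricia
              or SpamAssassin code, and all function names are hypothetical.

```python
import ipaddress


def make_entries(*specs):
    """Build (network, included) pairs; a leading '!' marks an exclusion."""
    entries = []
    for spec in specs:
        included = not spec.startswith("!")
        entries.append((ipaddress.ip_network(spec.lstrip("!")), included))
    return entries


def first_match(entries, ip):
    """Sequential search: the first entry containing the address decides."""
    addr = ipaddress.ip_address(ip)
    for net, included in entries:
        if addr in net:
            return included
    return False


def tightest_match(entries, ip):
    """Longest-prefix search: the most specific containing entry decides."""
    addr = ipaddress.ip_address(ip)
    best = None
    for net, included in entries:
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, included)
    return best[1] if best is not None else False


# More specific entry first: both strategies agree.
recommended = make_entries("!10.0.1.5/32", "10.0.1.0/24")
# Less specific entry first: the strategies disagree about 10.0.1.5.
discouraged = make_entries("10.0.1.0/24", "!10.0.1.5/32")

if __name__ == "__main__":
    for name, entries in (("recommended", recommended), ("discouraged", discouraged)):
        print(name, first_match(entries, "10.0.1.5"), tightest_match(entries, "10.0.1.5"))
```

              With the more specific entry listed first, both searches give the
              same answer, which is the predictable behaviour the text
              recommends.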
685
686           Note: 127.0.0.0/8 and ::1 are always included in trusted_networks,
687           regardless of your config.
688
689           Examples:
690
691              trusted_networks 192.168.0.0/16        # all in 192.168.*.*
692              trusted_networks 192.168.              # all in 192.168.*.*
693              trusted_networks 212.17.35.15          # just that host
694              trusted_networks !10.0.1.5 10.0.1/24   # all in 10.0.1.* but not 10.0.1.5
695              trusted_networks 2001:db8:1::1 !2001:db8:1::/64 2001:db8::/32
696                # 2001:db8::/32 and 2001:db8:1::1/128, except the rest of 2001:db8:1::/64
697
698           This operates additively, so a "trusted_networks" line after
699           another one will append new entries to the list of trusted
700           networks.  To clear out the existing entries, use
701           "clear_trusted_networks".
702
703           If "trusted_networks" is not set and "internal_networks" is, the
704           value of "internal_networks" will be used for this parameter.
705
706           If neither "trusted_networks" nor "internal_networks" is set, a
707           basic inference algorithm is applied.  This works as follows:
708
709           ·   If the 'from' host has an IP address in a private (RFC 1918)
710               network range, then it's trusted
711
712           ·   If there are authentication tokens in the received header, and
713               the previous host was trusted, then this host is also trusted
714
715           ·   Otherwise this host, and all further hosts, are considered
716               untrusted.
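
              The three inference steps above might be sketched like this.  The
              relay records, the "auth" key and the function name are
              hypothetical, and the sketch assumes that the scanning host
              itself counts as the trusted starting point; the real
              implementation may differ in details.

```python
import ipaddress

# RFC 1918 private IPv4 ranges.
PRIVATE_NETS = [
    ipaddress.ip_network(n)
    for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def infer_trust(relays):
    """Return one trusted/untrusted flag per relay, newest relay first.

    Each relay is a dict with an 'ip' key and an optional 'auth' key
    (hypothetical record layout, for illustration only).
    """
    flags = []
    previous_trusted = True  # assumption: the scanning host is trusted
    for relay in relays:
        addr = ipaddress.ip_address(relay["ip"])
        if any(addr in net for net in PRIVATE_NETS):
            trusted = True
        elif relay.get("auth") and previous_trusted:
            trusted = True
        else:
            trusted = False
        if not trusted:
            # This host and all further hosts are untrusted.
            flags.extend([False] * (len(relays) - len(flags)))
            break
        flags.append(True)
        previous_trusted = trusted
    return flags


if __name__ == "__main__":
    chain = [
        {"ip": "192.168.1.10"},
        {"ip": "203.0.113.5", "auth": "ESMTPA"},
        {"ip": "198.51.100.7"},
        {"ip": "10.0.0.1"},
    ]
    print(infer_trust(chain))
```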
717
718       clear_trusted_networks
719           Empty the list of trusted networks.
720
721       internal_networks IPaddress[/masklen] ...   (default: none)
722           What networks or hosts are 'internal' in your setup.   Internal
723           means that relay hosts on these networks are considered to be MXes
724           for your domain(s), or internal relays.  This uses the same syntax
725           as "trusted_networks", above - see there for details.
726
727           This value is used when checking 'dial-up' or dynamic IP address
728           blocklists, in order to detect direct-to-MX spamming.
729
730           Trusted relays that accept mail directly from dial-up connections
731           (i.e. are also performing a role of mail submission agents - MSA)
732           should not be listed in "internal_networks". List them only in
733           "trusted_networks".
734
735           If "trusted_networks" is set and "internal_networks" is not, the
736           value of "trusted_networks" will be used for this parameter.
737
738           If neither "trusted_networks" nor "internal_networks" is set, no
739           addresses will be considered local; in other words, any relays past
740           the machine where SpamAssassin is running will be considered
741           external.
742
743           Every entry in "internal_networks" must appear in
744           "trusted_networks"; in other words, "internal_networks" is always a
745           subset of the trusted set.
746
747           Note: 127/8 and ::1 are always included in internal_networks,
748           regardless of your config.
749
750       clear_internal_networks
751           Empty the list of internal networks.
752
753       msa_networks IPaddress[/masklen] ...   (default: none)
754           The networks or hosts which are acting as MSAs in your setup (but
755           not also as MX relays). This uses the same syntax as
756           "trusted_networks", above - see there for details.
757
758           MSA means that the relay hosts on these networks accept mail from
759           your own users and authenticate them appropriately.  These relays
760           will never accept mail from hosts that aren't authenticated in some
761           way. Examples of authentication include IP lists, SMTP AUTH, POP-
762           before-SMTP, etc.
763
764           All relays found in the message headers after the MSA relay will
765           take on the same trusted and internal classifications as the MSA
766           relay itself, as defined by your trusted_networks and
767           internal_networks configuration.
768
769           For example, if the MSA relay is trusted and internal, so are all
770           of the relays that precede it.
771
772           When using msa_networks to identify an MSA it is recommended that
773           you treat that MSA as both trusted and internal.  When an MSA is
774           not included in msa_networks you should treat the MSA as trusted
775           but not internal, however if the MSA is also acting as an MX or
776           intermediate relay you must always treat it as both trusted and
777           internal and ensure that the MSA includes visible auth tokens in
778           its Received header to identify submission clients.
779
780           Warning: Never include an MSA that also acts as an MX (or is also
781           an intermediate relay for an MX) or otherwise accepts mail from
782           non-authenticated users in msa_networks.  Doing so will result in
783           unknown external relays being trusted.
784
785       clear_msa_networks
786           Empty the list of msa networks.
787
788       originating_ip_headers header ...   (default: X-Yahoo-Post-IP
789       X-Originating-IP X-Apparently-From X-SenderIP)
790           A list of header field names from which an originating IP address
791           can be obtained. For example, webmail servers may record a client
792           IP address in X-Originating-IP.
793
794           These IP addresses are virtually appended into the Received: chain,
795           so they are used in RBL checks where appropriate.
796
797           Currently the IP addresses are not added into X-Spam-Relays-*
798           header fields, but they may be in the future.
799
800       clear_originating_ip_headers
801           Empty the list of 'originating IP address' header field names.
802
803       always_trust_envelope_sender ( 0 | 1 )   (default: 0)
804           Trust the envelope sender even if the message has been passed
805           through one or more trusted relays.  See also
806           "envelope_sender_header".
807
808       skip_rbl_checks ( 0 | 1 )   (default: 0)
809           Turning on the skip_rbl_checks setting will disable the DNSEval
810           plugin, which implements Real-time Block List (or: Blackhole List)
811           (RBL) lookups.
812
813           By default, SpamAssassin will run RBL checks. Individual blocklists
814           may be disabled selectively by setting a score of a corresponding
815           rule to 0.
816
817           See also a related configuration parameter skip_uribl_checks, which
818           controls the URIDNSBL plugin (documented in the URIDNSBL man page).
819
820       dns_available { yes | no | test[: domain1 domain2...] }   (default:
821       yes)
822           Tells SpamAssassin whether DNS resolving is available or not. A
823           value yes indicates DNS resolving is available, a value no
824           indicates DNS resolving is not available - both of these values
825           apply unconditionally and skip initial DNS tests, which can be slow
826           or unreliable.
827
828           When the option value is a test (with or without arguments),
829           SpamAssassin will query some domain names on the internet during
830           initialization, attempting to determine if DNS resolving is working
831           or not. A space-separated list of domain names may be specified
832           explicitly, or left to a built-in default of a dozen or so domain
833           names. From an explicit or a default list a subset of three domain
834           names is picked randomly for checking. The test queries for NS
835           records of these domains: if at least one query returns a success
836           then SpamAssassin considers DNS resolving as available, otherwise
837           not.
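
              The test logic described above amounts to roughly the following.
              This Python sketch takes the actual NS lookup as a caller-supplied
              function (query_ns, a hypothetical name) so that it stays
              self-contained; whether the real queries are sent one after
              another or in parallel is not stated here and is not modelled.

```python
import random


def dns_seems_available(domains, query_ns, rng=random):
    """Pick up to three domains at random and report whether any NS query works.

    query_ns is a callable taking a domain name and returning True when the
    NS lookup succeeded (supplied by the caller; hypothetical interface).
    """
    candidates = list(domains)
    sample = rng.sample(candidates, min(3, len(candidates)))
    # One successful answer is enough to consider DNS available.
    return any(query_ns(domain) for domain in sample)


if __name__ == "__main__":
    # Stand-in lookup that pretends only one domain resolves.
    def fake_lookup(domain):
        return domain == "example.org"

    print(dns_seems_available(["example.org"], fake_lookup))
```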
838
839           The drawback is that the test can introduce some startup delay if a
840           network connection is down, and in some cases it can wrongly guess
841           that DNS is unavailable because a test connection failed, which
842           disables several DNS-dependent tests.
843
844           Please note, the DNS test queries for NS records, so specify domain
845           names, not host names.
846
847           Since version 3.4.0 of SpamAssassin the default setting for option
848           dns_available is yes. The default in older versions was test.
849
850       dns_server ip-addr-port  (default: entries provided by Net::DNS)
851           Specifies an IP address of a DNS server, and optionally its port
852           number.  The dns_server directive may be specified multiple times,
853           each entry adding to a list of available resolving name servers.
854           The ip-addr-port argument can either be an IPv4 or IPv6 address,
855           optionally enclosed in brackets, and optionally followed by a colon
856           and a port number. In absence of a port number a standard port
857           number 53 is assumed. When an IPv6 address is specified along with
858           a port number, the address must be enclosed in brackets to avoid
859           parsing ambiguity regarding a colon separator. A scoped link-local
860           IP address is allowed (assuming underlying modules allow it).
861
862           Examples:
863            dns_server 127.0.0.1
864            dns_server 127.0.0.1:53
865            dns_server [127.0.0.1]:53
866            dns_server [::1]:53
867            dns_server fe80::1%lo0
868            dns_server [fe80::1%lo0]:53
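
              How the ip-addr-port forms listed above break down into an address
              and a port can be sketched as follows.  The function name is
              hypothetical and the sketch does not validate the address itself;
              it only illustrates the bracket and colon rules.

```python
import re


def parse_dns_server(value):
    """Split a dns_server argument into (address, port).

    Illustrative only: no validation of the address is attempted.
    """
    bracketed = re.fullmatch(r"\[([^\]]+)\](?::(\d+))?", value)
    if bracketed:
        # Bracketed form: required for IPv6 with a port, allowed for IPv4.
        host, port = bracketed.group(1), bracketed.group(2)
    elif value.count(":") == 1:
        # Exactly one colon: IPv4 address followed by a port.
        host, port = value.split(":")
    else:
        # Bare IPv4 address, or unbracketed IPv6 address without a port.
        host, port = value, None
    # Port 53 is assumed when none is given.
    return host, int(port) if port else 53


if __name__ == "__main__":
    for arg in ("127.0.0.1", "127.0.0.1:53", "[::1]:53", "fe80::1%lo0"):
        print(arg, "->", parse_dns_server(arg))
```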
869
870           In absence of dns_server directives, the list of name servers is
871           provided by Net::DNS module, which typically obtains the list from
872           /etc/resolv.conf, but this may be platform dependent. Please
873           consult the Net::DNS::Resolver documentation for details.
874
875       clear_dns_servers
876           Empty the list of DNS servers explicitly configured through
877           dns_server directives, falling back to Net::DNS-supplied defaults.
878
879       dns_local_ports_permit ranges...
880           Add the specified ports or port ranges to the set of allowed port
881           numbers that can be used as local port numbers when sending DNS
882           queries to a resolver.
883
884           The argument is a whitespace-separated or a comma-separated list of
885           single port numbers n, or port number pairs (i.e. m-n) delimited by
886           a '-', representing a range. Allowed port numbers are between 1 and
887           65535.
888
889           Directives dns_local_ports_permit and dns_local_ports_avoid are
890           processed in order in which they appear in configuration files.
891           Each directive adds (or subtracts) its subsets of ports to a
892           current set of available ports.  Whatever is left in the set by the
893           end of configuration processing is made available to a DNS
894           resolving client code.
895
896           If the resulting set of port numbers is empty (see also the
897           directive dns_local_ports_none), then SpamAssassin does not apply
898           its ports randomization logic, but instead leaves the operating
899           system to choose a suitable free local port number.
900
901           The initial set consists of all port numbers in the range
902           1024-65535.  Note that system config files already modify the set
903           and remove all the IANA registered port numbers and some other
904           ranges, so there is rarely a need to adjust the ranges by site-
905           specific directives.
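
              The in-order processing of these directives can be pictured as
              plain set arithmetic.  The Python sketch below is an illustration
              only: the function names are hypothetical, and it starts from the
              documented initial range without the adjustments that the system
              config files make.

```python
def parse_ranges(argument):
    """Expand 'n' and 'm-n' items (whitespace- or comma-separated) into a set."""
    ports = set()
    for item in argument.replace(",", " ").split():
        low_text, _, high_text = item.partition("-")
        low = int(low_text)
        high = int(high_text) if high_text else low
        if not 1 <= low <= high <= 65535:
            raise ValueError("bad port range: " + item)
        ports.update(range(low, high + 1))
    return ports


def apply_directives(directives):
    """Apply (name, argument) pairs in order and return the final port set."""
    allowed = set(range(1024, 65536))  # documented initial set
    for name, argument in directives:
        if name == "dns_local_ports_permit":
            allowed |= parse_ranges(argument)
        elif name == "dns_local_ports_avoid":
            allowed -= parse_ranges(argument)
        elif name == "dns_local_ports_none":
            allowed = set()
        else:
            raise ValueError("unknown directive: " + name)
    return allowed


if __name__ == "__main__":
    result = apply_directives([
        ("dns_local_ports_none", ""),
        ("dns_local_ports_permit", "5000-5009, 6000"),
    ])
    print(sorted(result))
```

              An empty result corresponds to the case where the operating
              system is left to choose the local port.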
906
907           See also directives dns_local_ports_avoid and
908           dns_local_ports_none.
909
910       dns_local_ports_avoid ranges...
911           Remove specified ports or port ranges from the set of allowed port
912           numbers that can be used as local port numbers when sending DNS
913           queries to a resolver.
914
915           Please see directive dns_local_ports_permit for details.
916
917       dns_local_ports_none
918           This is a fast shorthand for:
919
920             dns_local_ports_avoid 1-65535
921
922           leaving the set of available DNS query local port numbers empty. In
923           all respects (apart from speed) it is equivalent to the shown
924           directive, and can be freely mixed with dns_local_ports_permit and
925           dns_local_ports_avoid.
926
927           If the resulting set of port numbers is empty, then SpamAssassin
928           does not apply its ports randomization logic, but instead leaves
929           the operating system to choose a suitable free local port number.
930
931           See also directives dns_local_ports_permit and
932           dns_local_ports_avoid.
933
934       dns_test_interval n   (default: 600 seconds)
935           If dns_available is set to test, the dns_test_interval time in
936           number of seconds will tell SpamAssassin how often to retest for
937           working DNS.  A numeric value is optionally suffixed by a time unit
938           (s, m, h, d, w, indicating seconds (default), minutes, hours, days,
939           weeks).
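
              The time-unit suffixes convert to seconds as in the following
              Python sketch.  The helper name is hypothetical, and the sketch
              assumes whole numbers and lower-case unit letters; it does not
              claim to mirror every input form SpamAssassin accepts.

```python
import re

# Seconds per unit letter; a missing suffix means seconds.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}


def parse_duration(value):
    """Convert a value such as '600', '10m' or '3w' into seconds."""
    match = re.fullmatch(r"(\d+)([smhdw]?)", value.strip())
    if match is None:
        raise ValueError("unrecognized duration: " + value)
    number, unit = match.groups()
    return int(number) * UNIT_SECONDS[unit or "s"]


if __name__ == "__main__":
    for text in ("600", "10m", "8d", "3w"):
        print(text, "=", parse_duration(text), "seconds")
```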
940
941       dns_options opts   (default: norotate, nodns0x20, edns=4096)
942           Provides a (whitespace- or comma-separated) list of options
943           applying to DNS resolving. Available options are: rotate, dns0x20
944           and edns (or edns0). Option name may be negated by prepending a no
945           (e.g. norotate, NoEDNS) to counteract a previously enabled option.
946           Option names are not case-sensitive. The dns_options directive may
947           appear in configuration files multiple times, the last setting
948           prevails.
949
950           Option edns (or edns0) may take a value which specifies a
951           requestor's acceptable UDP payload size according to EDNS0
952           specifications (RFC 6891, ex RFC 2671) e.g. edns=4096. When EDNS0
953           is off (noedns or edns=512) a traditional implied UDP payload size
954           is 512 bytes, which is also a minimum allowed value for this
955           option. When the option is specified but a value is not provided, a
956           conservative default of 1220 bytes is implied. It is recommended to
957           keep edns enabled when using a local recursive DNS server which
958           supports EDNS0 (like most modern DNS servers do), a suitable
959           setting in this case is edns=4096, which is also a default.
960           Allowing UDP payload size larger than 512 bytes can avoid
961           truncation of resource records in large DNS responses (like in TXT
962           records of some SPF and DKIM responses, or when an unreasonable
963           number of A records is published by some domain). The option should
964           be disabled when a recursive DNS server is only reachable through
965           non-RFC 6891 compliant middleboxes (such as some old-fashioned
966           firewalls) which ban DNS UDP payload sizes larger than 512 bytes. A
967           suitable value when a non-local recursive DNS server is used and a
968           middlebox does allow EDNS0 but blocks fragmented IP packets is
969           perhaps 1220 bytes, allowing a DNS UDP packet to fit within a
970           single IP packet in most cases (a slightly less conservative range
971           would be 1280-1410 bytes).
972
973           Option rotate causes SpamAssassin to choose a DNS server at random
974           from all servers listed in "/etc/resolv.conf" every
975           dns_test_interval seconds, effectively spreading the load over all
976           currently available DNS servers when there are many spamd workers.
977
978           Option dns0x20 enables randomization of letters in a DNS query
979           label according to draft-vixie-dnsext-dns0x20, decreasing a chance
980           of collisions of responses (by chance or by a malicious intent) by
981           increasing spread as provided by a 16-bit query ID and up to 16
982           bits of a port number, with additional bits as encoded by flipping
983           case (upper/lower) of letters in a query. The number of additional
984           random bits corresponds to the number of letters in a query label.
985           This should work reliably with all mainstream DNS servers.  Do not
986           turn it on if you see frequent info messages "dns: no callback for
987           id:" in the log, or if RBL or URIDNS lookups fail for no apparent
988           reason.
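
               The idea behind dns0x20 can be illustrated with a few lines of
               Python.  This is a sketch of the case-flipping technique only,
               with hypothetical function names; it is not the code
               SpamAssassin uses to build its queries.

```python
import random


def randomize_case(name, rng=random):
    """Flip each ASCII letter of a query name to a random case."""
    return "".join(
        rng.choice((ch.lower(), ch.upper())) if ch.isascii() and ch.isalpha() else ch
        for ch in name
    )


def extra_random_bits(name):
    """Each letter contributes one extra bit; digits, dots and hyphens none."""
    return sum(1 for ch in name if ch.isascii() and ch.isalpha())


if __name__ == "__main__":
    query = "zen.spamhaus.org"
    print(randomize_case(query), extra_random_bits(query))
```

               A response is only accepted if it echoes the query name with
               exactly the same mix of upper and lower case, which is why a
               resolver that does not preserve case breaks lookups.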
989
990       dns_query_restriction (allow|deny) domain1 domain2 ...
991           This option allows disabling of rules which would result in a DNS
992           query to one of the listed domains. The first argument must be a
993           literal "allow" or "deny"; the remaining arguments are domain names.
994
995           Most DNS queries (with some exceptions) are subject to
996           dns_query_restriction.  A domain to be queried is successively
997           stripped of its leading labels (thus yielding a series of its
998           parent domains), and on each iteration a check is made against an
999           associative array generated by dns_query_restriction options.
1000           Search stops at the first match (i.e. the tightest match), and the
1001           matching entry with its "allow" or "deny" value then controls
1002           whether a DNS query is allowed to be launched.
1003
1004           If no match is found an implicit default is to allow a query. The
1005           purpose of an explicit "allow" entry is to be able to override a
1006           previously configured "deny" on the same domain or to override an
1007           entry (possibly yet to be configured in subsequent config
1008           directives) on one of its parent domains.  Thus an 'allow
1009           zen.spamhaus.org' with a 'deny spamhaus.org' would permit DNS
1010           queries on a specific DNS BL zone but deny queries to other zones
1011           under the same parent domain.
1012
1013           Domains are matched case-insensitively, no wildcards are
1014           recognized, there should be no leading or trailing dot.
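
               The lookup described above (strip leading labels until an entry
               matches, otherwise allow) can be sketched in Python like this.
               The function names and the dictionary standing in for the
               associative array are hypothetical; the domains are the ones
               used in this section's example.

```python
def build_restrictions(directives):
    """Collect (action, domains) pairs into a lookup table.

    Later entries for the same domain replace earlier ones.
    """
    table = {}
    for action, domains in directives:
        if action not in ("allow", "deny"):
            raise ValueError("first argument must be allow or deny")
        for domain in domains:
            table[domain.lower()] = action
    return table


def query_allowed(domain, table):
    """Return True if a DNS query for the given domain may be launched."""
    labels = domain.lower().split(".")
    # Try the full name first, then each parent domain in turn.
    for start in range(len(labels)):
        candidate = ".".join(labels[start:])
        if candidate in table:
            return table[candidate] == "allow"  # tightest match wins
    return True  # implicit default: allow


if __name__ == "__main__":
    table = build_restrictions([
        ("deny", ["dnswl.org", "surbl.org"]),
        ("allow", ["zen.spamhaus.org"]),
        ("deny", ["spamhaus.org", "mailspike.net", "spamcop.net"]),
    ])
    for name in ("2.0.0.127.zen.spamhaus.org", "dbl.spamhaus.org", "example.com"):
        print(name, query_allowed(name, table))
```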
1015
1016           Specifying a block on querying a domain name has a similar effect
1017           as setting a score of corresponding DNSBL and URIBL rules to zero,
1018           and can be a handy alternative to hunting for such rules when a
1019           site policy does not allow certain DNS block lists to be queried.
1020
1021           Example:
1022             dns_query_restriction deny  dnswl.org surbl.org
1023             dns_query_restriction allow zen.spamhaus.org
1024             dns_query_restriction deny  spamhaus.org mailspike.net spamcop.net
1026
1027       clear_dns_query_restriction
1028           This option removes any entries entered by previous
1029           'dns_query_restriction' options, leaving the list empty, i.e.
1030           allowing DNS queries for any domain (including any DNS BL zone).
1031
1032   LEARNING OPTIONS
1033       use_learner ( 0 | 1 )         (default: 1)
1034           Whether to use any machine-learning classifiers with SpamAssassin,
1035           such as the default 'BAYES_*' rules.  Setting this to 0 will
1036           disable use of any and all human-trained classifiers.
1037
1038       use_bayes ( 0 | 1 )      (default: 1)
1039           Whether to use the naive-Bayesian-style classifier built into
1040           SpamAssassin.  This is a master on/off switch for all Bayes-related
1041           operations.
1042
1043       use_bayes_rules ( 0 | 1 )          (default: 1)
1044           Whether to use rules using the naive-Bayesian-style classifier
1045           built into SpamAssassin.  This allows you to disable the rules
1046           while leaving auto and manual learning enabled.
1047
1048       bayes_auto_learn ( 0 | 1 )      (default: 1)
1049           Whether SpamAssassin should automatically feed high-scoring mails
1050           (or low-scoring mails, for non-spam) into its learning systems.
1051           The only learning system supported currently is a naive-Bayesian-
1052           style classifier.
1053
1054           See the documentation for the
1055           "Mail::SpamAssassin::Plugin::AutoLearnThreshold" plugin module for
1056           details on how Bayes auto-learning is implemented by default.
1057
1058       bayes_token_sources  (default: header visible invisible uri)
1059           Controls which sources in a mail message can contribute tokens
1060           (e.g. words, phrases, etc.) to a Bayes classifier. The argument is
1061           a space-separated list of keywords (header, visible, invisible,
1062           uri, mimepart), each of which may be prefixed by a no to indicate
1063           its exclusion. Additionally two reserved keywords are allowed: all
1064           and none (or: noall). The list of keywords is processed
1065           sequentially: a keyword all adds all available keywords to a set
1066           being built, a none or noall clears the set, other non-negated
1067           keywords are added to the set, and negated keywords are removed
1068           from the set. Keywords are case-insensitive.
1069
1070           The default set is: header visible invisible uri, which is
1071           equivalent for example to: All NoMIMEpart. The reason why mimepart
1072           is not currently in a default set is that it is a newer source
1073           (introduced with SpamAssassin version 3.4.1) and not much
1074           experience has yet been gathered regarding its usefulness.
1075
1076           See also option "bayes_ignore_header" for a fine-grained control on
1077           individual header fields under the umbrella of a more general
1078           keyword header here.
1079
1080           Keywords imply the following data sources:
1081
1082           header - tokens collected from a message header section
1083           visible - words from visible text (plain or HTML) in a message body
1084           invisible - hidden/invisible text in HTML parts of a message body
1085           uri - URIs collected from a message body
1086           mimepart - digests (hashes) of all MIME parts (textual or non-
1087           textual) of a message, computed after Base64 and quoted-printable
1088           decoding, suffixed by their Content-Type
1089           all - adds all the above keywords to the set being assembled
1090           none or noall - removes all keywords from the set
1091
1092           The "bayes_token_sources" directive may appear multiple times, its
1093           keywords are interpreted sequentially, adding or removing items
1094           from the final set as they appear in their order in
1095           "bayes_token_sources" directive(s).
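
               The sequential keyword processing can be sketched in Python as
               follows.  The function name is hypothetical, and the sketch
               assumes that each directive continues from the set left by the
               previous one, starting from the documented default.

```python
ALL_SOURCES = ("header", "visible", "invisible", "uri", "mimepart")
DEFAULT_SOURCES = frozenset(("header", "visible", "invisible", "uri"))


def apply_token_sources(argument, current=DEFAULT_SOURCES):
    """Apply one bayes_token_sources argument to the current set of sources."""
    selected = set(current)
    for keyword in argument.lower().split():  # keywords are case-insensitive
        if keyword == "all":
            selected = set(ALL_SOURCES)
        elif keyword in ("none", "noall"):
            selected = set()
        elif keyword in ALL_SOURCES:
            selected.add(keyword)
        elif keyword.startswith("no") and keyword[2:] in ALL_SOURCES:
            selected.discard(keyword[2:])
        else:
            raise ValueError("unknown keyword: " + keyword)
    return selected


if __name__ == "__main__":
    print(sorted(apply_token_sources("All NoMIMEpart")))
    print(sorted(apply_token_sources("none header uri")))
```

               The first call reproduces the default set, matching the
               equivalence stated above.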
1096
1097       bayes_ignore_header header_name
1098           If you receive mail filtered by upstream mail systems, like a spam-
1099           filtering ISP or mailing list, and that service adds new headers
1100           (as most of them do), these headers may provide inappropriate cues
1101           to the Bayesian classifier, allowing it to take a "short cut". To
1102           avoid this, list the headers using this setting.  Example:
1103
1104                   bayes_ignore_header X-Upstream-Spamfilter
1105                   bayes_ignore_header X-Upstream-SomethingElse
1106
1107       bayes_ignore_from user@example.com
1108           Bayesian classification and autolearning will not be performed on
1109           mail from the listed addresses.  Program "sa-learn" will also
1110           ignore the listed addresses if it is invoked using the
1111           "--use-ignores" option.  One or more addresses can be listed, see
1112           "whitelist_from".
1113
1114           Spam messages from certain senders may contain many words that
1115           frequently occur in ham.  For example, one might read messages from
1116           a preferred bookstore but also get unwanted spam messages from
1117           other bookstores.  If the unwanted messages are learned as spam
1118           then any messages discussing books, including the preferred
1119           bookstore and antiquarian messages would be in danger of being
1120           marked as spam.  The addresses of the annoying bookstores would be
1121           listed.  (Assuming they were halfway legitimate and didn't send you
1122           mail through myriad affiliates.)
1123
1124           Those who have pieces of spam in legitimate messages or otherwise
1125           receive ham messages containing potentially spammy words might fear
1126           that some spam messages might be in danger of being marked as ham.
1127           The addresses of the spam mailing lists, correspondents, etc.
1128           would be listed.
1129
1130       bayes_ignore_to user@example.com
1131           Bayesian classification and autolearning will not be performed on
1132           mail to the listed addresses.  See "bayes_ignore_from" for details.
1133
1134       bayes_min_ham_num             (Default: 200)
1135       bayes_min_spam_num       (Default: 200)
1136           To be accurate, the Bayes system does not activate until a certain
1137           number of ham (non-spam) and spam have been learned.  The default
1138           is 200 each of ham and spam, but you can tune these up or down with
1139           these two settings.
1140
1141       bayes_learn_during_report         (Default: 1)
1142           The Bayes system will, by default, learn any reported messages
1143           ("spamassassin -r") as spam.  If you do not want this to happen,
1144           set this option to 0.
1145
1146       bayes_sql_override_username
1147           Used by BayesStore::SQL storage implementation.
1148
1149           If this option is set the BayesStore::SQL module will override the
1150           set username with the value given.  This could be useful for
1151           implementing global or group bayes databases.
1152
1153       bayes_use_hapaxes        (default: 1)
1154           Should the Bayesian classifier use hapaxes (words/tokens that occur
1155           only once) when classifying?  This produces significantly better
1156           hit-rates.
1157
1158       bayes_journal_max_size        (default: 102400)
1159           SpamAssassin will opportunistically sync the journal and the
1160           database.  It will do so once a day, but will sync more often if
1161           the journal file size goes above this setting, in bytes.  If set to
1162           0, opportunistic syncing will not occur.
1163
1164       bayes_expiry_max_db_size      (default: 150000)
1165           What should be the maximum size of the Bayes tokens database?  When
1166           expiry occurs, the Bayes system will keep either 75% of the maximum
1167           value, or 100,000 tokens, whichever has a larger value.  150,000
1168           tokens is roughly equivalent to an 8 MB database file.
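
               As a worked example of the rule above, the number of tokens kept
               after expiry can be computed as shown in this small Python
               sketch (the function name is hypothetical).

```python
def tokens_kept_after_expiry(max_db_size):
    """Return the larger of 75% of the maximum size and 100,000 tokens."""
    return max(max_db_size * 3 // 4, 100000)


if __name__ == "__main__":
    for size in (150000, 120000, 400000):
        print(size, "->", tokens_kept_after_expiry(size))
```

               With the default of 150,000 the 75% figure (112,500) applies;
               for any maximum below about 133,334 the 100,000-token floor
               takes over.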
1169
1170       bayes_auto_expire             (default: 1)
1171           If enabled, the Bayes system will try to automatically expire old
1172           tokens from the database.  Auto-expiry occurs when the number of
1173           tokens in the database surpasses the bayes_expiry_max_db_size
1174           value. If a bayes datastore backend does not implement individual
1175           key/value expirations, the setting is silently ignored.
1176
1177       bayes_token_ttl               (default: 3w, i.e. 3 weeks)
1178           Time-to-live / expiration time in seconds for tokens kept in a
1179           Bayes database.  A numeric value is optionally suffixed by a time
1180           unit (s, m, h, d, w, indicating seconds (default), minutes, hours,
1181           days, weeks).
1182
1183           If bayes_auto_expire is true and a Bayes datastore backend supports
1184           it (currently only Redis), this setting controls deletion of
1185           expired tokens from a bayes database. The value is observed on a
1186           best-effort basis; exact timing promises are not necessarily kept.
1187           If a bayes datastore backend does not implement individual
1188           key/value expirations, the setting is silently ignored.
1189
1190       bayes_seen_ttl                (default: 8d, i.e. 8 days)
1191           Time-to-live / expiration time in seconds for 'seen' entries (i.e.
1192           mail message digests with their status) kept in a Bayes database.
1193           A numeric value is optionally suffixed by a time unit (s, m, h, d,
1194           w, indicating seconds (default), minutes, hours, days, weeks).
1195
1196           If bayes_auto_expire is true and a Bayes datastore backend supports
1197           it (currently only Redis), this setting controls deletion of
1198           expired 'seen' entries from a bayes database. The value is observed
1199           on a best-effort basis; exact timing promises are not necessarily
1200           kept. If a bayes datastore backend does not implement individual
1201           key/value expirations, the setting is silently ignored.
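
               A sketch of the time-unit suffixes, assuming a backend that
               supports per-entry expiry (per the text above, currently only
               Redis).  The values simply restate the defaults in other units:

```
bayes_auto_expire  1
# 21 days, the same duration as the default of 3w.
bayes_token_ttl    21d
# 192 hours, the same duration as the default of 8d.
bayes_seen_ttl     192h
```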
1202
1203       bayes_learn_to_journal   (default: 0)
1204           If this option is set, whenever SpamAssassin does Bayes learning,
1205           it will put the information into the journal instead of directly
1206           into the database.  This lowers contention for locking the database
1207           to execute an update, but will also cause more access to the
1208           journal and cause a delay before the updates are actually committed
1209           to the Bayes database.
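
               For example, a busy site might combine journalled learning with
               a larger journal threshold.  The figure 204800 is an arbitrary
               illustration, not a recommendation:

```
bayes_learn_to_journal  1
# Sync more often than once a day only when the journal grows past
# 204800 bytes (twice the default of 102400).
bayes_journal_max_size  204800
```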
1210
1211   MISCELLANEOUS OPTIONS
1212       time_limit n   (default: 300)
1213           Specifies a limit on elapsed time in seconds that SpamAssassin is
1214           allowed to spend before providing a result. The value may be
1215           fractional and must not be negative; zero is interpreted as
1216           unlimited. The default is 300 seconds for consistency with the
1217           spamd default setting of --timeout-child.
1218
1219           This is a best-effort advisory setting: processing will not be
1220           abruptly aborted at an arbitrary point when the time limit is
1221           exceeded, but only on reaching one of the locations in the
1222           program flow equipped with a time test. Currently equipped with the
1223           test are the main checking loop, asynchronous DNS lookups, and
1224           plugins which call external programs.  Rule evaluation is guarded by
1225           starting a timer (alarm) on each set of compiled rules.
1226
1227           When a message is passed to Mail::SpamAssassin::parse, a deadline
1228           time is established as a sum of current time and the "time_limit"
1229           setting.
1230
1231           This deadline may also be specified by a caller through an option
1232           'master_deadline' in $suppl_attrib on a call to parse(), possibly
1233           providing a more accurate deadline taking into account past and
1234           expected future processing of a message in a mail filtering setup.
1235           If both the config option and a 'master_deadline' option in a call
1236           are provided, the shorter time limit of the two is used
1237           (since version 3.3.2).  Note that spamd (and possibly third-party
1238           callers of SpamAssassin) will supply the 'master_deadline' option
1239           in a call based on its --timeout-child option (or equivalent),
1240           unlike the command line "spamassassin", which has no such command
1241           line option.
1242
1243           When a time limit is exceeded, most of the remaining tests will be
1244           skipped, as well as auto-learning. Whatever tests fired so far will
1245           determine the final score. The behaviour is similar to
1246           short-circuiting with attribute 'on', as implemented by the
1247           Shortcircuit plugin. A synthetic hit on a rule named
1248           TIME_LIMIT_EXCEEDED with a near-zero default score is generated, so
1249           that the report will reflect the event. A score for
1250           TIME_LIMIT_EXCEEDED may be provided explicitly in a configuration
1251           file, for example to achieve a whitelisting or blacklisting effect
1252           for messages with long processing times.
1253
1254           The "time_limit" option is a useful protection against excessive
1255           processing time on certain degenerate or unusually long or complex
1256           mail messages, as well as against some DoS attacks. It is also
1257           needed in time-critical pre-queue filtering setups (e.g. milter,
1258           proxy, integration with MTA), where message processing must finish
1259           before an SMTP client times out.  RFC 5321 prescribes in section
1260           4.5.3.2.6 the 'DATA Termination' time limit of 10 minutes, although
1261           it is not unusual to see some SMTP clients abort sooner while
1262           waiting for a response. A sensible "time_limit" for a pre-queue
1263           filtering setup is perhaps 50 seconds, assuming that clients will
1264           wait at least a minute.
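
               A sketch of a pre-queue setup along the lines described above.
               The score of 0.5 is a hypothetical choice; leave the rule at its
               near-zero default if long processing time should not affect the
               verdict:

```
# Leave headroom below a 60-second client timeout.
time_limit  50

# Optional: give the synthetic rule a small positive score so that
# messages which hit the limit lean slightly towards spam.
score TIME_LIMIT_EXCEEDED  0.5
```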
1265
1266       lock_method type
1267           Select the file-locking method used to protect database files
1268           on disk. By default, SpamAssassin uses an NFS-safe locking method on
1269           UNIX; however, if you are sure that the database files you'll be
1270           using for Bayes and AWL storage will never be accessed over NFS, a
1271           non-NFS-safe locking system can be selected.
1272
1273           This will be quite a bit faster, but may risk file corruption if
1274           the files are ever accessed by multiple clients at once, and one or
1275           more of them is accessing them through an NFS filesystem.
1276
1277           Note that different platforms require different locking systems.
1278
1279           The supported locking systems for "type" are as follows:
1280
1281           nfssafe - an NFS-safe locking system
1282           flock - simple UNIX "flock()" locking
1283           win32 - Win32 locking using "sysopen (..., O_CREAT|O_EXCL)".
1284
1285           nfssafe and flock are only available on UNIX, and win32 is only
1286           available on Windows.  By default, SpamAssassin will choose either
1287           nfssafe or win32 depending on the platform in use.
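
               For instance, on a UNIX host whose Bayes and AWL files live on
               local disk and are never reached over NFS, the faster method
               could be selected as follows (a sketch; verify the assumption
               about NFS before relying on it):

```
lock_method  flock
```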
1288
1289       fold_headers ( 0 | 1 )        (default: 1)
1290           By default, headers added by SpamAssassin will be whitespace
1291           folded.  In other words, they will be broken up into multiple lines
1292           instead of one very long one and each continuation line will have a
1293           tab character prepended to mark it as a continuation of the
1294           preceding one.
1295
1296           The automatic wrapping can be disabled here.  Note that this can
1297           generate very long lines.  RFC 2822 requires that header lines do
1298           not exceed 998 characters (not counting the final CRLF).
1299
1300       report_safe_copy_headers header_name ...
1301           If using "report_safe", a few of the headers from the original
1302           message are copied into the wrapper header (From, To, Cc, Subject,
1303           Date, etc.).  If you want to have other headers copied as well, you
1304           can add them using this option.  You can specify multiple headers
1305           on the same line, separated by spaces, or you can just use multiple
1306           lines.
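
               For example, to copy three additional headers into the wrapper
               (the header names are only illustrations), either form can be
               used:

```
# Several headers on one line...
report_safe_copy_headers  X-Mailer List-Id
# ...or one more on a further line.
report_safe_copy_headers  User-Agent
```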
1307
1308       envelope_sender_header Name-Of-Header
1309           SpamAssassin will attempt to discover the address used in the 'MAIL
1310           FROM:' phase of the SMTP transaction that delivered this message,
1311           if this data has been made available by the SMTP server.  This is
1312           used in the "EnvelopeFrom" pseudo-header, and for various rules
1313           such as SPF checking.
1314
1315           By default, various MTAs will use different headers, such as the
1316           following:
1317
1318               X-Envelope-From
1319               Envelope-Sender
1320               X-Sender
1321               Return-Path
1322
1323           SpamAssassin will attempt to use these, if some heuristics (such as
1324           the header placement in the message, or the absence of fetchmail
1325           signatures) appear to indicate that they are safe to use.  However,
1326           it may choose the wrong headers in some mailserver configurations.
1327           (More discussion of this can be found in bug 2142 and bug 4747 in
1328           the SpamAssassin BugZilla.)
1329
1330           To avoid this heuristic failure, the "envelope_sender_header"
1331           setting may be helpful.  Name the header that your MTA or MDA adds
1332           to messages containing the address used at the MAIL FROM step of
1333           the SMTP transaction.
1334
1335           If the header in question contains "<" or ">" characters at the
1336           start and end of the email address in the right-hand side, as in
1337           the SMTP transaction, these will be stripped.
1338
1339           If the header is not found in a message, or if its value does not
1340           contain an "@" sign, SpamAssassin will issue a warning in the logs
1341           and fall back to its default heuristics.
1342
1343           (Note for MTA developers: we would prefer if the use of a single
1344           header be avoided in future, since that precludes 'downstream' spam
1345           scanning.
1346           "http://wiki.apache.org/spamassassin/EnvelopeSenderInReceived"
1347           details a better proposal, storing the envelope sender at each hop
1348           in the "Received" header.)
1349
1350           example:
1351
1352               envelope_sender_header X-SA-Exim-Mail-From
1353
1354       describe SYMBOLIC_TEST_NAME description ...
1355           Used to describe a test.  This text is shown to users in the
1356           detailed report.
1357
1358           Note that test names which begin with '__' are reserved for meta-
1359           match sub-rules, and are not scored or listed in the 'tests hit'
1360           reports.
1361
1362           Also note that by convention, rule descriptions should be limited
1363           in length to no more than 50 characters.
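
               A short sketch showing a description attached to a rule.  The
               rule name, pattern and score are hypothetical; the description
               stays well under the 50-character convention:

```
header   LOCAL_SUBJ_INVOICE  Subject =~ /\binvoice\b/i
describe LOCAL_SUBJ_INVOICE  Subject mentions an invoice
score    LOCAL_SUBJ_INVOICE  0.5
```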
1364
1365       report_charset CHARSET        (default: unset)
1366           Set the MIME Content-Type charset used for the text/plain report
1367           which is attached to spam mail messages.
1368
1369       report ...some text for a report...
1370           Set the report template which is attached to spam mail messages.
1371           See the "10_default_prefs.cf" configuration file in
1372           "/usr/share/spamassassin" for an example.
1373
1374           If you change this, try to keep it under 78 columns. Each "report"
1375           line appends to the existing template, so use
1376           "clear_report_template" to restart.
1377
1378           Tags can be included as explained above.
1379
1380       clear_report_template
1381           Clear the report template.
1382
1383       report_contact ...text of contact address...
1384           Set what _CONTACTADDRESS_ is replaced with in the above report
1385           text.  By default, this is 'the administrator of that system',
1386           since the hostname of the system the scanner is running on is also
1387           included.
1388
1389       report_hostname ...hostname to use...
1390           Set what _HOSTNAME_ is replaced with in the above report text.  By
1391           default, this is determined dynamically as whatever the host
1392           running SpamAssassin calls itself.
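
               A sketch of a custom report that combines "report",
               "report_contact" and "report_hostname".  The address and
               hostname are placeholders, and the wording is invented; the
               _SCORE_ and _REQD_ tags are taken from the template tags
               described earlier in this page:

```
# Start from an empty template, then build it up line by line.
clear_report_template
report_contact   postmaster@example.com
report_hostname  mx1.example.com
report Spam detection software on "_HOSTNAME_" has flagged this message.
report For questions, contact _CONTACTADDRESS_.
report Score: _SCORE_ (required: _REQD_)
```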
1393
1394       unsafe_report ...some text for a report...
1395           Set the report template which is attached to spam mail messages
1396           which contain a non-text/plain part.  See the "10_default_prefs.cf"
1397           configuration file in "/usr/share/spamassassin" for an example.
1398
1399           Each "unsafe_report" line appends to the existing template, so use
1400           "clear_unsafe_report_template" to restart.
1401
1402           Tags can be used in this template (see above for details).
1403
1404       clear_unsafe_report_template
1405           Clear the unsafe_report template.
1406
1407       mbox_format_from_regex
1408           Set a specific regular expression to be used for mbox file From
1409           separators.
1410
1411           For example, this setting will allow sa-learn to process emails
1412           stored in a kmail 2 mbox:
1413
1414           mbox_format_from_regex /^From \S+  ?[[:upper:]][[:lower:]]{2}(?:, \d\d [[:upper:]][[:lower:]]{2} \d{4} [0-2]\d:\d\d:\d\d [+-]\d{4}| [[:upper:]][[:lower:]]{2} [ 1-3]\d [ 0-2]\d:\d\d:\d\d \d{4})/
1417
1418       parse_dkim_uris ( 0 | 1 ) (default: 1)
1419           If this option is set to 1 and the message contains DKIM headers,
1420           the headers will be parsed for URIs to process alongside URIs found
1421           in the body with some rules and modules (e.g. URIDNSBL).
1422

RULE DEFINITIONS AND PRIVILEGED SETTINGS

1424       These settings differ from the ones above, in that they are considered
1425       'privileged'.  Only users running "spamassassin" from their
1426       procmailrc's or forward files, or sysadmins editing a file in
1427       "/etc/mail/spamassassin", can use them.   "spamd" users cannot use them
1428       in their "user_prefs" files, for security and efficiency reasons,
1429       unless "allow_user_rules" is enabled (and then, they may only add rules
1430       from below).
1431
1432       allow_user_rules ( 0 | 1 )         (default: 0)
1433           This setting allows users to create rules (and only rules) in their
1434           "user_prefs" files for use with "spamd". It defaults to off,
1435           because this could be a severe security hole. It may be possible
1436           for users to gain root level access if "spamd" is run as root. It
1437           is NOT a good idea, unless you have some other way of ensuring that
1438           users' tests are safe. Don't use this unless you are certain you
1439           know what you are doing. Furthermore, this option causes
1440           spamassassin to recompile all the tests each time it processes a
1441           message for a user with a rule in his/her "user_prefs" file, which
1442           could have a significant effect on server load. It is not
1443           recommended.
1444
1445           Note that it is not currently possible to use "allow_user_rules" to
1446           modify an existing system rule from a "user_prefs" file with
1447           "spamd".
1448
1449       redirector_pattern  /pattern/modifiers
1450           A regex pattern that matches both the redirector site portion, and
1451           the target site portion of a URI.
1452
1453           Note: The target URI portion must be enclosed in parentheses, and
1454                 no other part of the pattern may use capturing parentheses.
1455
1456           Example:
1457           http://chkpt.zdnet.com/chkpt/whatever/spammer.domain/yo/dude
1458
1459             redirector_pattern    /^https?:\/\/(?:opt\.)?chkpt\.zdnet\.com\/chkpt\/\w+\/(.*)$/i
1460
1461       header SYMBOLIC_TEST_NAME header op /pattern/modifiers [if-unset:
1462       STRING]
1463           Define a test.  "SYMBOLIC_TEST_NAME" is a symbolic test name, such
1464           as 'FROM_ENDS_IN_NUMS'.  "header" is the name of a mail header
1465           field, such as 'Subject', 'To', 'From', etc.  Header field names
1466           are matched case-insensitively (conforming to RFC 5322 section
1467           1.2.2), except for all-capitals metaheader fields such as ALL,
1468           MESSAGEID, ALL-TRUSTED.
1469
1470           Appending a modifier ":raw" to a header field name will inhibit
1471           decoding of quoted-printable or base-64 encoded strings, and will
1472           preserve all whitespace inside the header string.  The ":raw" may
1473           also be applied to pseudo-headers e.g. "ALL:raw" will return a
1474           pristine (unmodified) header section.
1475
1476           Appending a modifier ":addr" to a header field name will cause
1477           everything except the first email address to be removed from the
1478           header field.  It is mainly applicable to header fields 'From',
1479           'Sender', 'To', 'Cc' along with their 'Resent-*' counterparts, and
1480           the 'Return-Path'.
1481
1482           Appending a modifier ":name" to a header field name will cause
1483           everything except the first display name to be removed from the
1484           header field. It is mainly applicable to header fields containing a
1485           single mail address: 'From', 'Sender', along with their
1486           'Resent-From' and 'Resent-Sender' counterparts.
1487
1488           It is syntactically permitted to append more than one modifier to a
1489           header field name, although currently most combinations achieve no
1490           additional effect, for example "From:addr:raw" or "From:raw:addr"
1491           is currently the same as "From:addr" .
1492
1493           For example, appending ":addr" to a header name will result in
1494           example@foo in all of the following cases:
1495
1496           example@foo
1497           example@foo (Foo Blah)
1498           example@foo, example@bar
1499           display: example@foo (Foo Blah), example@bar ;
1500           Foo Blah <example@foo>
1501           "Foo Blah" <example@foo>
1502           "'Foo Blah'" <example@foo>
1503
1504           For example, appending ":name" to a header name will result in "Foo
1505           Blah" (without quotes) in all of the following cases:
1506
1507           example@foo (Foo Blah)
1508           example@foo (Foo Blah), example@bar
1509           display: example@foo (Foo Blah), example@bar ;
1510           Foo Blah <example@foo>
1511           "Foo Blah" <example@foo>
1512           "'Foo Blah'" <example@foo>
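
               A sketch of the two modifiers in use, written against the
               sample values listed above.  The rule names are hypothetical.
               Note that "@" is escaped in the pattern:

```
# Matches when the first address in From: is exactly example@foo.
header   LOCAL_FROM_ADDR_FOO   From:addr =~ /^example\@foo$/i

# Matches when the first display name in From: is exactly Foo Blah.
header   LOCAL_FROM_NAME_FOO   From:name =~ /^Foo Blah$/
```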
1513
1514           There are several special pseudo-headers that can be specified:
1515
1516           "ALL" can be used to mean the text of all the message's headers.
1517           Note that all whitespace inside the headers, at line folds, is
1518           currently compressed into a single space (' ') character. To obtain
1519           a pristine (unmodified) header section, use "ALL:raw" - the :raw
1520           modifier is documented above. Similar pseudo-headers return only
1521           the headers added by specific relays: ALL-TRUSTED, ALL-INTERNAL,
1522           ALL-UNTRUSTED, ALL-EXTERNAL.
1523           "ToCc" can be used to mean the contents of both the 'To' and 'Cc'
1524           headers.
1525           "EnvelopeFrom" is the address used in the 'MAIL FROM:' phase of the
1526           SMTP transaction that delivered this message, if this data has been
1527           made available by the SMTP server.  See "envelope_sender_header"
1528           for more information on how to set this.
1529           "MESSAGEID" is a symbol meaning all Message-Id's found in the
1530           message; some mailing list software moves the real 'Message-Id' to
1531           'Resent-Message-Id' or to 'X-Message-Id', then uses its own one in
1532           the 'Message-Id' header. The value returned for this symbol is the
1533           text from all 3 headers, separated by newlines.
1534           "X-Spam-Relays-Untrusted", "X-Spam-Relays-Trusted",
1535           "X-Spam-Relays-Internal" and "X-Spam-Relays-External" represent a
1536           portable, pre-parsed representation of the message's network path,
1537           as recorded in the Received headers, divided into 'trusted' vs
1538           'untrusted' and 'internal' vs 'external' sets.  See
1539           "http://wiki.apache.org/spamassassin/TrustedRelays" for more
1540           details.
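
               Two illustrative rules using pseudo-headers; the rule names,
               address and patterns are hypothetical.  Because MESSAGEID may
               hold several IDs separated by newlines, the second rule uses
               "/m" so that "$" can match at the end of each one:

```
# Recipient listed in either To: or Cc:.
header   LOCAL_TOCC_SALES        ToCc =~ /\bsales\@example\.com\b/i

# Any of the Message-Id variants ends in "@localhost>".
header   LOCAL_MSGID_LOCALHOST   MESSAGEID =~ /\@localhost>$/m
```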
1541
1542           "op" is either "=~" (contains regular expression) or "!~" (does not
1543           contain regular expression), and "pattern" is a valid Perl regular
1544           expression, with "modifiers" as regexp modifiers in the usual
1545           style.   Note that multi-line rules are not supported, even if you
1546           use "x" as a modifier.  Also note that the "#" character must be
1547           escaped ("\#") or else it will be considered to be the start of a
1548           comment and not part of the regexp.
1549
1550           If the header specified matches multiple headers, their text will
1551           be concatenated with embedded \n's. Therefore you may wish to use
1552           "/m" if you use "^" or "$" in your regular expression.
1553
1554           If the "[if-unset: STRING]" tag is present, then "STRING" will be
1555           used if the header is not found in the mail message.
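
               The three points above can be sketched as follows.  All rule
               names and patterns are hypothetical examples:

```
# "#" must be escaped, or the rest of the line becomes a comment.
header   LOCAL_SUBJ_HASH        Subject =~ /\#\d+/

# Several Received headers are joined with newlines, so "/m" lets
# "^" match at the start of each one.
header   LOCAL_RCVD_LOCALHOST   Received =~ /^from localhost\b/m

# With no Subject header at all, the pattern is tested against "UNSET".
header   LOCAL_NO_SUBJECT       Subject =~ /^UNSET$/ [if-unset: UNSET]
```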
1556
1557           Test names must not start with a number, and must contain only
1558           alphanumerics and underscores.  It is suggested that lower-case
1559           characters not be used, and names have a length of no more than 22
1560           characters, as an informal convention.  Dashes are not allowed.
1561
1562           Note that test names which begin with '__' are reserved for meta-
1563           match sub-rules, and are not scored or listed in the 'tests hit'
1564           reports.  Test names which begin with 'T_' are reserved for tests
1565           which are undergoing QA, and these are given a very low score.
1566
1567           If you add or modify a test, please be sure to run a sanity check
1568           afterwards by running "spamassassin --lint".  This will avoid
1569           confusing error messages, or other tests being skipped as a side-
1570           effect.
1571
1572       header SYMBOLIC_TEST_NAME exists:header_field_name
1573           Define a header field existence test.  "header_field_name" is the
1574           name of a header field to test for existence.  Not to be confused
1575           with a test for a nonempty header field body, which can be
1576           implemented by a "header SYMBOLIC_TEST_NAME header =~ /\S/" rule as
1577           described above.
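
               To illustrate the distinction, here is a sketch with two
               hypothetical rules.  The first fires whenever the field is
               present, even if empty; the second fires only when the field
               has a non-whitespace value:

```
header   LOCAL_HAS_LIST_ID   exists:List-Id
header   LOCAL_LIST_ID_SET   List-Id =~ /\S/
```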
1578
1579       header SYMBOLIC_TEST_NAME eval:name_of_eval_method([arguments])
1580           Define a header eval test.  "name_of_eval_method" is the name of a
1581           method registered by a "Mail::SpamAssassin::Plugin" object.
1582           "arguments" are optional arguments to the function call.
1583
1584       header SYMBOLIC_TEST_NAME eval:check_rbl('set', 'zone' [, 'sub-test'])
1585           Check a DNSBL (a DNS blacklist or whitelist).  This will retrieve
1586           Received: headers from the message, extract the IP addresses,
1587           select which ones are 'untrusted' based on the "trusted_networks"
1588           logic, and query that DNSBL zone.  There are a few things to note:
1589
1590           duplicated or private IPs
1591               Duplicated IPs are only queried once and reserved IPs are not
1592               queried.  Private IPs are those listed in
1593               <https://www.iana.org/assignments/ipv4-address-space>,
1594               <http://duxcw.com/faq/network/privip.htm>,
1595               <http://duxcw.com/faq/network/autoip.htm>, or
1596               <https://tools.ietf.org/html/rfc5735> as private.
1597
1598           the 'set' argument
1599               This is used as a 'zone ID'.  If you want to look up a
1600               multiple-meaning zone like SORBS, you can then query the
1601               results from that zone using it; but all check_rbl_sub() calls
1602               must use that zone ID.
1603
1604               Also, if more than one IP address gets a DNSBL hit for a
1605               particular rule, it does not affect the score because rules
1606               only trigger once per message.
1607
1608           the 'zone' argument
1609               This is the root zone of the DNSBL.
1610
1611               The domain name is considered to be a fully qualified domain
1612               name (i.e. not subject to DNS resolver's search or default
1613               domain options).  A trailing period is not needed, and will be
1614               removed if specified.
1615
1616           the 'sub-test' argument
1617               This optional argument behaves the same as the sub-test
1618               argument in "check_rbl_sub()" below.
1619
1620           selecting all IPs except for the originating one
1621               This is accomplished by placing '-notfirsthop' at the end of
1622               the set name.  This is useful for querying against DNS lists
1623               which list dialup IP addresses; the first hop may be a dialup,
1624               but as long as there is at least one more hop, via their
1625               outgoing SMTP server, that's legitimate, and so should not gain
1626               points.  If there is only one hop, that will be queried anyway,
1627               as it should be relaying via its outgoing SMTP server instead
1628               of sending directly to your MX (mail exchange).
1629
1630           selecting IPs by whether they are trusted
1631               When checking a 'nice' DNSBL (a DNS whitelist), you cannot
1632               trust the IP addresses in Received headers that were not added
1633               by trusted relays.  To test the first IP address that can be
1634               trusted, place '-firsttrusted' at the end of the set name.
1635               That should test the IP address of the relay that connected to
1636               the most remote trusted relay.
1637
1638               Note that this requires that SpamAssassin know which relays are
1639               trusted.  For simple cases, SpamAssassin can make a good
1640               estimate.  For complex cases, you may get better results by
1641               setting "trusted_networks" manually.
1642
1643               In addition, you can test all untrusted IP addresses by placing
1644               '-untrusted' at the end of the set name.   Important note --
1645               this does NOT include the IP address from the most recent
1646               'untrusted line', as used in '-firsttrusted' above.  That's
1647               because we're talking about the trustworthiness of the IP
1648               address data, not the source header line, here; and in the case
1649               of the most recent header (the 'firsttrusted'), that data can
1650               be trusted.  See the Wiki page at
1651               "http://wiki.apache.org/spamassassin/TrustedRelays" for more
1652               information on this.
1653
1654           selecting just the last external IP
1655               By using '-lastexternal' at the end of the set name, you can
1656               select only the external host that connected to your internal
1657               network, or at least the last external host with a public IP.
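
               A sketch of two DNSBL rules against hypothetical zones under
               example.org; the set names and rule names are invented for
               illustration.  The "tflags ... net" lines, documented elsewhere
               in this page, mark the rules as network tests:

```
# Query only the last external relay.
header   LOCAL_RBL_EXAMPLE      eval:check_rbl('example-lastexternal', 'dnsbl.example.org.')
describe LOCAL_RBL_EXAMPLE      Last external relay listed in example DNSBL
tflags   LOCAL_RBL_EXAMPLE      net

# Query every untrusted relay except the originating one.
header   LOCAL_RBL_EX_NOTFIRST  eval:check_rbl('exampledul-notfirsthop', 'dul.example.org.')
describe LOCAL_RBL_EX_NOTFIRST  Relay listed in example dialup DNSBL
tflags   LOCAL_RBL_EX_NOTFIRST  net
```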
1658
1659       header SYMBOLIC_TEST_NAME eval:check_rbl_txt('set', 'zone')
1660           Same as check_rbl(), except querying using IN TXT instead of IN A
1661           records.  If the zone supports it, it will result in a line of text
1662           describing why the IP is listed, typically a hyperlink to a
1663           database entry.
1664
1665       header SYMBOLIC_TEST_NAME eval:check_rbl_sub('set', 'sub-test')
1666           Create a sub-test for 'set'.  If you want to look up a multi-
1667           meaning zone like relays.osirusoft.com, you can then query the
1668           results from that zone using the zone ID from the original query.
1669           The sub-test may either be an IPv4 dotted address for RBLs that
1670           return multiple A records, or a non-negative decimal number to
1671           specify a bitmask for RBLs that return a single A record containing
1672           a bitmask of results, or a regular expression.
1673
1674           Note: the set name must be exactly the same as for the main query
1675           rule, including selections like '-notfirsthop' appearing at the end
1676           of the set name.
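
               A sketch of one main query with two sub-tests, assuming a
               hypothetical multi-meaning zone multi.example.org.  The main
               rule starts with "__" so that only the sub-tests are scored:

```
# Main query; its set name is reused verbatim by the sub-tests.
header   __LOCAL_RBL_MULTI     eval:check_rbl('exmulti-lastexternal', 'multi.example.org.')
tflags   __LOCAL_RBL_MULTI     net

# Sub-test given as a dotted address: the zone returned 127.0.0.2.
header   LOCAL_RBL_MULTI_IP    eval:check_rbl_sub('exmulti-lastexternal', '127.0.0.2')
tflags   LOCAL_RBL_MULTI_IP    net

# Sub-test given as a decimal bitmask: the bit with value 4 is set.
header   LOCAL_RBL_MULTI_BIT   eval:check_rbl_sub('exmulti-lastexternal', '4')
tflags   LOCAL_RBL_MULTI_BIT   net
```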
1677
1678       body SYMBOLIC_TEST_NAME /pattern/modifiers
1679           Define a body pattern test.  "pattern" is a Perl regular
1680           expression.  Note: as per the header tests, "#" must be escaped
1681           ("\#") or else it is considered the beginning of a comment.
1682
1683           The 'body' in this case is the textual parts of the message body;
1684           any non-text MIME parts are stripped, and the message decoded from
1685           Quoted-Printable or Base-64-encoded format if necessary.  Parts
1686           declared as text/html will be rendered from HTML to text.
1687
1688           Each body paragraph (a double-newline-separated block of text) is
1689           turned into a single line, with line breaks removed and whitespace
1690           normalized.  Any line longer than 2 kB is split into shorter
1691           separate lines (at a boundary when possible); this may
1692           unexpectedly prevent a pattern from matching.  Patterns are matched
1693           independently against each of these lines.
1694
1695           Note that by default the message Subject header is considered part
1696           of the body and becomes the first line when running the rules. If
1697           you don't want to match Subject along with body text, use "tflags
1698           RULENAME nosubject".
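
               A sketch of a body rule that excludes the Subject line.  The
               rule name and phrase are hypothetical.  Since whitespace is
               normalized and line breaks are removed within a paragraph, a
               single space in the pattern also matches where the original
               text wrapped between the two words:

```
body     LOCAL_BODY_FREE_MONEY   /\bfree money\b/i
describe LOCAL_BODY_FREE_MONEY   Body mentions free money
tflags   LOCAL_BODY_FREE_MONEY   nosubject
```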
1699
1700       body SYMBOLIC_TEST_NAME eval:name_of_eval_method([args])
1701           Define a body eval test.  See above.
1702
1703       uri SYMBOLIC_TEST_NAME /pattern/modifiers
1704           Define a uri pattern test.  "pattern" is a Perl regular expression.
1705           Note: as per the header tests, "#" must be escaped ("\#") or else
1706           it is considered the beginning of a comment.
1707
1708           The 'uri' in this case is a list of all the URIs in the body of the
1709           email, and the test will be run on each and every one of those
1710           URIs, adjusting the score if a match is found. Use this test
1711           instead of one of the body tests when you need to match a URI, as
1712           it is more accurately bound to the start/end points of the URI, and
1713           will also be faster.
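
               For example, a hypothetical rule that matches links to
               example.net or any of its subdomains.  Anchoring with "^" works
               because the pattern is tested against each URI separately:

```
uri      LOCAL_URI_EXAMPLE_NET   /^https?:\/\/(?:[^\/]+\.)?example\.net(?:\/|$)/i
describe LOCAL_URI_EXAMPLE_NET   Links to example.net
```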
1714
1715       rawbody SYMBOLIC_TEST_NAME /pattern/modifiers
1716           Define a raw-body pattern test.  "pattern" is a Perl regular
1717           expression.  Note: as per the header tests, "#" must be escaped
1718           ("\#") or else it is considered the beginning of a comment.
1719
1720           The 'raw body' of a message is the raw data inside all textual
1721           parts. The text will be decoded from base64 or quoted-printable
1722           encoding, but HTML tags and line breaks will still be present.
1723           Multiline expressions will need to be used to match strings that
1724           are broken by line breaks.
1725
1726           Note that the text is split into 2-4 kB chunks (at a word boundary
1727           when possible); this may unexpectedly prevent a pattern from
1728           matching.  Patterns are matched independently against each of these
1729           chunks.
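
               A sketch of a raw-body rule that tolerates line breaks inside
               an HTML tag; the rule name and pattern are hypothetical.  Both
               "\s" and "[^>]" match newlines, so a tag broken across lines
               still matches as long as it falls within one chunk:

```
rawbody  LOCAL_RAW_FORM_TAG   /<form\s[^>]*action=/i
describe LOCAL_RAW_FORM_TAG   HTML form with an action attribute
```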
1730
1731       rawbody SYMBOLIC_TEST_NAME eval:name_of_eval_method([args])
1732           Define a raw-body eval test.  See above.
1733
1734       full SYMBOLIC_TEST_NAME /pattern/modifiers
1735           Define a full message pattern test.  "pattern" is a Perl regular
1736           expression.  Note: as per the header tests, "#" must be escaped
1737           ("\#") or else it is considered the beginning of a comment.
1738
1739           The full message is the pristine message headers plus the pristine
1740           message body, including all MIME data such as images, other
1741           attachments, MIME boundaries, etc.
1742
1743       full SYMBOLIC_TEST_NAME eval:name_of_eval_method([args])
1744           Define a full message eval test.  See above.
1745
1746       meta SYMBOLIC_TEST_NAME boolean expression
1747           Define a boolean expression test in terms of other tests that have
1748           been hit or not hit.  For example:
1749
1750           meta META1        TEST1 && !(TEST2 || TEST3)
1751
1752           Note that English language operators ("and", "or") will be treated
1753           as rule names, and that there is no "XOR" operator.
1754
1755       meta SYMBOLIC_TEST_NAME boolean arithmetic expression
1756           Can also define an arithmetic expression in terms of other tests,
1757           with an unhit test having the value "0" and a hit test having a
1758           nonzero value.  The value of a hit meta test is that of its
1759           arithmetic expression.  The value of a hit eval test is that
1760           returned by its method.  The value of a hit header, body, rawbody,
1761           uri, or full test which has the "multiple" tflag is the number of
1762           times the test hit.  The value of any other type of hit test is
1763           "1".
1764
1765           For example:
1766
1767           meta META2        (3 * TEST1 - 2 * TEST2) > 0
1768
1769           Note that Perl builtins and functions, like "abs()", can't be used,
1770           and will be treated as rule names.
1771
1772           If you want to define a meta-rule, but do not want its individual
1773           sub-rules to count towards the final score unless the entire meta-
1774           rule matches, give the sub-rules names that start with '__' (two
1775           underscores).  SpamAssassin will ignore these for scoring.
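
               The sub-rule convention can be sketched as follows, using
               invented rule names and an invented score.  Neither "__" rule
               contributes to the score on its own; only the meta rule does:

               ```
               header    __LOCAL_FROM_EXAMPLE   From:addr =~ /\@example\.net$/i
               body      __LOCAL_ASKS_PASSWORD  /\bconfirm your password\b/i

               meta      LOCAL_PASSWORD_COMBO   __LOCAL_FROM_EXAMPLE && __LOCAL_ASKS_PASSWORD
               describe  LOCAL_PASSWORD_COMBO   example.net sender asking for a password
               score     LOCAL_PASSWORD_COMBO   2.0
               ```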
1776
1777       meta SYMBOLIC_TEST_NAME ... rules_matching(RULEGLOB) ...
1778           Special function that will expand to a list of matching rulenames.
1779           Can be used anywhere in expressions.  Argument supports glob style
1780           rulename matching (* = anything, ? = one character).  Matching is
1781           case-sensitive.
1782
1783           For example, this will hit if at least two __FOO_* rules hit:
1784
1785            body __FOO_1  /xxx/
1786            body __FOO_2  /yyy/
1787            body __FOO_3  /zzz/
1788            meta FOO_META  rules_matching(__FOO_*) >= 2
1789
1790           Which would be the same as:
1791
1792            meta FOO_META  (__FOO_1 + __FOO_2 + __FOO_3) >= 2
1793
1794       reuse SYMBOLIC_TEST_NAME [ OLD_SYMBOLIC_TEST_NAME_1 ... ]
1795           Defines the name of a test that should be "reused" during the
1796           scoring process. If a message has an X-Spam-Status header that
1797           shows a hit for this rule or any of the old rule names given, a hit
1798           will be added for this rule when mass-check --reuse is used.
1799           Examples:
1800
1801           "reuse SPF_PASS"
1802
1803           "reuse MY_NET_RULE_V2 MY_NET_RULE_V1"
1804
1805           The actual logic for reuse tests is done by
1806           Mail::SpamAssassin::Plugin::Reuse.
1807
1808       tflags SYMBOLIC_TEST_NAME flags
1809           Used to set flags on a test. Parameter is a space-separated list of
1810           flag names or flag name = value pairs.  These flags are used in the
1811           score-determination back end system for details of the test's
1812           behaviour.  Please see "bayes_auto_learn" for more information
1813           about tflag interaction with those systems. The following flags can
1814           be set:
1815
1816           net The test is a network test, and will not be run in the mass
1817               checking system or if -L is used, therefore its score should
1818               not be modified.
1819
1820           nice
1821               The test is intended to compensate for common false positives,
1822               and should be assigned a negative score.
1823
1824           userconf
1825               The test requires user configuration before it can be used
1826               (like language-specific tests).
1827
1828           learn
1829               The test requires training before it can be used.
1830
1831           noautolearn
1832               The test will explicitly be ignored when calculating the score
1833               for learning systems.
1834
1835           autolearn_force
1836               The test will be subject to less stringent autolearn
1837               thresholds.
1838
1839               Normally, SpamAssassin will require 3 points from the header
1840               and 3 points from the body to be auto-learned as spam. This
1841               option keeps the threshold at 6 points total but changes it to
1842               have no regard to the source of the points.
1843
1844           noawl
1845               This flag only applies when the AWL plugin is in use.
1846
1847               Normally, the AWL plugin normalizes scores via the auto-whitelist.
1848               In some scenarios this works against a system administrator who
1849               is adding rules to correct misclassified email.  When the AWL
1850               plugin processes the email and finds the noawl flag, it exits
1851               without normalizing the score or storing the value in the db.
1852
1853           multiple
1854               The test will be evaluated multiple times, for use with meta
1855               rules.  Only affects header, body, rawbody, uri, and full
1856               tests.
1857
1858           maxhits=N
1859               If multiple is specified, limit the number of hits found to N.
1860               If the rule is used in a meta that counts the hits (e.g.
1861               __RULENAME > 5), this is a way to avoid wasted extra work (use
1862               "tflags multiple maxhits=6").
1863
1864               For example:
1865
1866                 uri      __KAM_COUNT_URIS /^./
1867                 tflags   __KAM_COUNT_URIS multiple maxhits=16
1868                 describe __KAM_COUNT_URIS A multiple match used to count URIs in a message
1869
1870                 meta __KAM_HAS_0_URIS (__KAM_COUNT_URIS == 0)
1871                 meta __KAM_HAS_1_URIS (__KAM_COUNT_URIS >= 1)
1872                 meta __KAM_HAS_2_URIS (__KAM_COUNT_URIS >= 2)
1873                 meta __KAM_HAS_3_URIS (__KAM_COUNT_URIS >= 3)
1874                 meta __KAM_HAS_4_URIS (__KAM_COUNT_URIS >= 4)
1875                 meta __KAM_HAS_5_URIS (__KAM_COUNT_URIS >= 5)
1876                 meta __KAM_HAS_10_URIS (__KAM_COUNT_URIS >= 10)
1877                 meta __KAM_HAS_15_URIS (__KAM_COUNT_URIS >= 15)
1878
1879           nosubject
1880               Used only for body rules.  If specified, Subject header will
1881               not be a part of the matched body text.  See body for more
1882               info.
1883
1884           ips_only
1885               This flag is specific to rules invoking the URIDNSBL plugin; it
1886               is documented there.
1887
1888           domains_only
1889               This flag is specific to rules invoking the URIDNSBL plugin; it
1890               is documented there.
1891
1892           ns  This flag is specific to rules invoking the URIDNSBL plugin; it
1893               is documented there.
1894
1895           a   This flag is specific to rules invoking the URIDNSBL plugin; it
1896               is documented there.
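
               To illustrate how flags are attached to rules, here is a small
               sketch.  The rule names, patterns and scores are assumptions made
               for the example; only the flag names come from the list above:

               ```
               # A compensating rule: negative score, excluded from autolearn scoring
               header    LOCAL_KNOWN_PARTNER  From:addr =~ /\@partner\.example\.org$/i
               describe  LOCAL_KNOWN_PARTNER  Sender is a known partner domain
               tflags    LOCAL_KNOWN_PARTNER  nice noautolearn
               score     LOCAL_KNOWN_PARTNER  -1.0

               # A body rule that should not see the Subject header
               body      LOCAL_BODY_PHRASE    /\bfree offer\b/i
               tflags    LOCAL_BODY_PHRASE    nosubject
               score     LOCAL_BODY_PHRASE    0.5
               ```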
1897
1898       priority SYMBOLIC_TEST_NAME n
1899           Assign a specific priority to a test.  All tests, except for DNS
1900           and Meta tests, are run in increasing priority value order
1901           (negative priority values are run before positive priority values).
1902           The default test priority is 0 (zero).
1903
1904           The values <-99999999999999> and <-99999999999998> have a special
1905           meaning internally, and should not be used.
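
               For instance, assuming two rules named LOCAL_CHEAP_CHECK and
               LOCAL_COSTLY_CHECK are defined elsewhere (both names are
               hypothetical), the following would run the first before
               default-priority rules and the second after them:

               ```
               priority  LOCAL_CHEAP_CHECK   -100
               priority  LOCAL_COSTLY_CHECK  500
               ```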
1906

ADMINISTRATOR SETTINGS

1908       These settings differ from the ones above, in that they are considered
1909       'more privileged' -- even more than the ones in the PRIVILEGED SETTINGS
1910       section.  No matter what "allow_user_rules" is set to, these can never
1911       be set from a user's "user_prefs" file when spamc/spamd is being used.
1912       However, all settings can be used by local programs run directly by the
1913       user.
1914
1915       version_tag string
1916           This tag is appended to the SA version in the X-Spam-Status header.
1917           You should include it when you modify your ruleset, especially if
1918           you plan to distribute it.  A good choice for string is your last
1919           name or your initials followed by a number which you increase with
1920           each change.
1921
1922           The version_tag will be lowercased, and any non-alphanumeric or
1923           period character will be replaced by an underscore.
1924
1925           e.g.
1926
1927             version_tag myrules1    # version=2.41-myrules1
1928
1929       test SYMBOLIC_TEST_NAME (ok|fail) Some string to test against
1930           Define a regression testing string. You can have more than one
1931           regression test string per symbolic test name. Simply specify a
1932           string that you wish the test to match.
1933
1934           These tests are only run as part of the test suite - they should
1935           not affect the general running of SpamAssassin.
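
               A short sketch with a made-up rule: one string that the rule is
               expected to match ("ok") and one that it is expected not to match
               ("fail"):

               ```
               body  LOCAL_CHEAP_MEDS  /\bcheap meds\b/i
               test  LOCAL_CHEAP_MEDS  ok    Buy cheap meds today
               test  LOCAL_CHEAP_MEDS  fail  Nothing suspicious in this line
               ```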
1936
1937       rbl_timeout t [t_min] [zone]       (default: 15 3)
1938           All DNS queries are made at the beginning of a check and we try to
1939           read the results at the end.  This value specifies the maximum
1940           period of time (in seconds) to wait for a DNS query.  If most of
1941           the DNS queries have succeeded for a particular message, then
1942           SpamAssassin will not wait for the full period to avoid wasting
1943           time on unresponsive server(s), but will shrink the timeout
1944           according to a percentage of queries already completed.  As the
1945           number of queries remaining approaches 0, the timeout value will
1946           gradually approach a t_min value, which is an optional second
1947           parameter and defaults to 0.2 * t.  If t is smaller than t_min, the
1948           initial timeout is set to t_min.  Here is a chart of queries
1949           remaining versus the timeout in seconds, for the default 15 second
1950           / 3 second timeout setting:
1951
1952             queries left  100%  90%  80%  70%  60%  50%  40%  30%  20%  10%   0%
1953             timeout        15   14.9 14.5 13.9 13.1 12.0 10.7  9.1  7.3  5.3  3
1954
1955           For example, if 20 queries are made at the beginning of a message
1956           check and 16 queries have returned (leaving 20%), the remaining 4
1957           queries should finish within 7.3 seconds since their query started
1958           or they will be timed out.  Note that timed out queries are only
1959           aborted when there is nothing else left for SpamAssassin to do -
1960           long evaluation of other rules may grant queries additional time.
1961
1962           If a parameter 'zone' is specified (it must end with a letter,
1963           which distinguishes it from the numeric parameters), then the
1964           setting only applies to DNS queries against the specified DNS
1965           domain (host, domain or RBL (sub)zone).  Matching is case-
1966           insensitive, and the actual domain may be a subdomain of the
1967           specified zone.
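
               As an illustrative sketch (the zone name is a placeholder, not a
               real list), a global setting and a per-zone override might look
               like this:

               ```
               # Wait at most 10 seconds, shrinking toward 2 seconds as queries complete
               rbl_timeout  10 2

               # Tighter limits for queries against one zone and its subdomains
               rbl_timeout  5 1 dnsbl.example.org
               ```

               The chart values above are consistent with
               timeout = t - (t - t_min) * f^2, where f is the fraction of
               queries already completed.  This is an observation about the
               chart, not a statement about the implementation.  Under that
               reading, "rbl_timeout 10 2" with half of the queries completed
               would give 10 - 8 * 0.25 = 8 seconds.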
1968
1969       util_rb_tld tld1 tld2 ...
1970           This option maintains the list of valid TLDs in the RegistryBoundaries
1971           code.  TLDs include things like com, net, org, etc.
1972
1973       util_rb_2tld 2tld-1.tld 2tld-2.tld ...
1974           This option maintains the list of valid 2nd-level TLDs in the
1975           RegistryBoundaries code.  2TLDs include things like co.uk, fed.us,
1976           etc.
1977
1978       util_rb_3tld 3tld1.some.tld 3tld2.other.tld ...
1979           This option maintains the list of valid 3rd-level TLDs in the
1980           RegistryBoundaries code.  3TLDs include things like demon.co.uk,
1981           plc.co.im, etc.
1982
1983       clear_util_rb
1984           Empties the internal list of valid TLDs (including 2nd and 3rd level)
1985           which the RegistryBoundaries code uses.  Only useful if you want to
1986           override the standard lists supplied by sa-update.
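
               A minimal sketch of replacing the lists by hand, reusing the
               example TLDs named in the descriptions above.  Real deployments
               would need far longer lists, so treat this purely as a syntax
               illustration:

               ```
               clear_util_rb
               util_rb_tld   com net org
               util_rb_2tld  co.uk fed.us
               util_rb_3tld  demon.co.uk plc.co.im
               ```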
1987
1988       bayes_path /path/filename     (default: ~/.spamassassin/bayes)
1989           This is the directory and filename for Bayes databases.  Several
1990           databases will be created, with this as the base directory and
1991           filename, with "_toks", "_seen", etc. appended to the base.  The
1992           default setting results in files called
1993           "~/.spamassassin/bayes_seen", "~/.spamassassin/bayes_toks", etc.
1994
1995           By default, each user has their own database in their
1996           "~/.spamassassin" directory with mode 0700/0600.  For system-wide
1997           SpamAssassin use, you may want to reduce disk space usage by sharing
1998           this across all users.  However, Bayes appears to be more effective
1999           with individual user databases.
2000
2001       bayes_file_mode          (default: 0700)
2002           The file mode bits used for the Bayesian filtering database files.
2003
2004           Make sure you specify this with the 'x' mode bits set, as it may
2005           also be used to create directories.  However, if a file is created,
2006           the resulting file will not have any execute bits set (the umask is
2007           set to 111).  The argument is a string of octal digits; it is
2008           converted to a numeric value internally.
2009
2010       bayes_store_module Name::Of::BayesStore::Module
2011           If this option is set, the module given will be used as an
2012           alternate to the default bayes storage mechanism.  It must conform
2013           to the published storage specification (see
2014           Mail::SpamAssassin::BayesStore). For example, set this to
2015           Mail::SpamAssassin::BayesStore::SQL to use the generic SQL storage
2016           module.
2017
2018       bayes_sql_dsn DBI:databasetype:databasename:hostname:port
2019           Used for BayesStore::SQL storage implementation.
2020
2021           This option gives the connect string used to connect to the SQL
2022           based Bayes storage.
2023
2024       bayes_sql_username
2025           Used by BayesStore::SQL storage implementation.
2026
2027           This option gives the username used by the above DSN.
2028
2029       bayes_sql_password
2030           Used by BayesStore::SQL storage implementation.
2031
2032           This option gives the password used by the above DSN.
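
               Put together, an SQL-backed Bayes setup might be sketched as
               below.  The database type, database name, host, port and
               credentials are all placeholders chosen for the example:

               ```
               bayes_store_module  Mail::SpamAssassin::BayesStore::SQL
               bayes_sql_dsn       DBI:mysql:spamassassin:localhost:3306
               bayes_sql_username  sa_user
               bayes_sql_password  change_me
               ```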
2033
2034       bayes_sql_username_authorized ( 0 | 1 )  (default: 0)
2035           Whether to call the services_authorized_for_username plugin hook in
2036           BayesSQL.  If the hook does not determine that the user is allowed
2037           to use bayes, or the user is invalid, then the database will not be
2038           initialized.
2039
2040           NOTE: By default the user is considered invalid until a plugin
2041           returns a true value.  If you enable this, but do not have a proper
2042           plugin loaded, all users will turn up as invalid.
2043
2044           The username passed into the plugin can be affected by the
2045           bayes_sql_override_username config option.
2046
2047       user_scores_dsn DBI:databasetype:databasename:hostname:port
2048           If you load user scores from an SQL database, this will set the DSN
2049           used to connect.  Example: "DBI:mysql:spamassassin:localhost"
2050
2051           If you load user scores from an LDAP directory, this will set the
2052           DSN used to connect. You have to write the DSN as an LDAP URL, the
2053           components being the host and port to connect to, the base DN for
2054           the search, the scope of the search (base, one or sub), the single
2055           attribute being the multivalued attribute used to hold the
2056           configuration data (space separated pairs of key and value, just as
2057           in a file) and finally the filter being the expression used to
2058           filter out the wanted username. Note that the filter expression is
2059           being used in a sprintf statement with the username as the only
2060           parameter, thus it can hold a single __USERNAME__ expression. This
2061           will be replaced with the username.
2062
2063           Example:
2064           "ldap://localhost:389/dc=koehntopp,dc=de?saconfig?uid=__USERNAME__"
2065
2066       user_scores_sql_username username
2067           The authorized username to connect to the above DSN.
2068
2069       user_scores_sql_password password
2070           The password for the database username, for the above DSN.
2071
2072       user_scores_sql_custom_query query
2073           This option gives you the ability to create a custom SQL query to
2074           retrieve user scores and preferences.  In order to work correctly
2075           your query should return two values, the preference name and value,
2076           in that order.  In addition, there are several "variables" that you
2077           can use as part of your query, these variables will be substituted
2078           for the current values right before the query is run.  The current
2079           allowed variables are:
2080
2081           _TABLE_
2082               The name of the table where user scores and preferences are
2083               stored. Currently hardcoded to userpref; to change this value
2084               you need to create a new custom query with the new table name.
2085
2086           _USERNAME_
2087               The current user's username.
2088
2089           _MAILBOX_
2090               The portion before the @ as derived from the current user's
2091               username.
2092
2093           _DOMAIN_
2094               The portion after the @ as derived from the current user's
2095               username; this value may be null.
2096
2097           The query must be one continuous line in order to parse correctly.
2098
2099           Here are several example queries.  Please note that these are
2100           broken up for easy reading; in your config each should be one
2101           continuous line.
2102
2103           Current default query:
2104               "SELECT preference, value FROM _TABLE_ WHERE username =
2105               _USERNAME_ OR username = '@GLOBAL' ORDER BY username ASC"
2106
2107           Use global and then domain level defaults:
2108               "SELECT preference, value FROM _TABLE_ WHERE username =
2109               _USERNAME_ OR username = '@GLOBAL' OR username = '@~'||_DOMAIN_
2110               ORDER BY username ASC"
2111
2112           Maybe global prefs should override user prefs:
2113               "SELECT preference, value FROM _TABLE_ WHERE username =
2114               _USERNAME_ OR username = '@GLOBAL' ORDER BY username DESC"
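
               For reference, this is how the default query shown above would
               appear in a configuration file, on one continuous line.  The DSN
               and credentials are placeholders:

               ```
               user_scores_dsn               DBI:mysql:spamassassin:localhost
               user_scores_sql_username      sa_user
               user_scores_sql_password      change_me
               user_scores_sql_custom_query  SELECT preference, value FROM _TABLE_ WHERE username = _USERNAME_ OR username = '@GLOBAL' ORDER BY username ASC
               ```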
2115
2116       user_scores_ldap_username
2117           This is the Bind DN used to connect to the LDAP server.  It
2118           defaults to the empty string (""), allowing anonymous binding to
2119           work.
2120
2121           Example: "cn=master,dc=koehntopp,dc=de"
2122
2123       user_scores_ldap_password
2124           This is the password used to connect to the LDAP server.  It
2125           defaults to the empty string ("").
2126
2127       user_scores_fallback_to_global        (default: 1)
2128           Fall back to global scores and settings if userprefs can't be
2129           loaded from SQL or LDAP, instead of passing the message through
2130           unprocessed.
2131
2132       loadplugin [Mail::SpamAssassin::Plugin::]ModuleName [/path/module.pm]
2133           Load a SpamAssassin plugin module.  The "ModuleName" is the perl
2134           module name, used to create the plugin object itself.
2135
2136           Module naming is strict: the name must only contain alphanumeric
2137           characters or underscores.  The file must have a .pm extension.
2138
2139           "/path/module.pm" is the file to load, containing the module's perl
2140           code; if it's specified as a relative path, it's considered to be
2141           relative to the current configuration file.  If it is omitted, the
2142           module will be loaded using perl's search path (the @INC array).
2143
2144           See "Mail::SpamAssassin::Plugin" for more details on writing
2145           plugins.
2146
2147       tryplugin ModuleName [/path/module.pm]
2148           Same as "loadplugin", but silently ignored if the .pm file cannot
2149           be found in the filesystem.
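
               A brief sketch of the three forms.  MyLocalPlugin,
               MyOptionalPlugin and their paths are hypothetical; the SPF plugin
               is used only as an example of a module found through @INC:

               ```
               # Loaded through perl's search path
               loadplugin  Mail::SpamAssassin::Plugin::SPF

               # Loaded from an explicit file
               loadplugin  MyLocalPlugin     /etc/mail/spamassassin/MyLocalPlugin.pm

               # Silently skipped if the file does not exist
               tryplugin   MyOptionalPlugin  /etc/mail/spamassassin/MyOptionalPlugin.pm
               ```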
2150
2151       ignore_always_matching_regexps         (Default: 0)
2152           Ignore any rule which contains a regexp which always matches.
2153           Currently only catches regexps which contain '||', or which begin
2154           or end with a '|'.  Also ignore rules with "some" combinatorial
2155           explosions.
2156

PREPROCESSING OPTIONS

2158       include filename
2159           Include configuration lines from "filename".   Relative paths are
2160           considered relative to the current configuration file or user
2161           preferences file.
2162
2163       if (boolean perl expression)
2164           Used to support conditional interpretation of the configuration
2165           file. Lines between this and a corresponding "else" or "endif" line
2166           will be ignored unless the expression evaluates as true (in the
2167           perl sense; that is, defined and non-0 and non-empty string).
2168
2169           The conditional accepts a limited subset of perl for security --
2170           just enough to perform basic arithmetic comparisons.  The following
2171           input is accepted:
2172
2173           numbers, whitespace, arithmetic operations and grouping
2174               Namely these characters and ranges:
2175
2176                 ( ) - + * / _ . , < = > ! ~ 0-9 whitespace
2177
2178           version
2179               This will be replaced with the version number of the currently-
2180               running SpamAssassin engine.  Note: The version used is in the
2181               internal SpamAssassin version format which is "x.yyyzzz", where
2182               x is major version, y is minor version, and z is maintenance
2183               version.  So 3.0.0 is 3.000000, and 3.4.80 is 3.004080.
2184
2185           perl_version
2186               (Introduced in 3.4.1)  This will be replaced with the version
2187               number of the currently-running perl engine.  Note: The version
2188               used is in the $] version format which is "x.yyyzzz", where x
2189               is major version, y is minor version, and z is maintenance
2190               version.  So 5.8.8 is 5.008008, and 5.10.0 is 5.010000.  Use this to
2191               protect rules that incorporate RE syntax elements introduced in
2192               later versions of perl, such as the "++" non-backtracking match
2193               introduced in perl 5.10. For example:
2194
2195                 # Avoid lint error on older perl installs
2196                 # Check SA version first to avoid warnings on checking perl_version on older SA
2197                 if version > 3.004001 && perl_version >= 5.018000
2198                   body  INVALID_RE_SYNTAX_IN_PERL_BEFORE_5_18  /(?[ \p{Thai} & \p{Digit} ])/
2199                 endif
2200
2201               Note that the above will still generate a warning on perl older
2202               than 5.10.0; to avoid that warning do this instead:
2203
2204                 # Avoid lint error on older perl installs
2205                 if can(Mail::SpamAssassin::Conf::perl_min_version_5010000)
2206                   body  INVALID_RE_SYNTAX_IN_PERL_5_8  /\w++/
2207                 endif
2208
2209               Warning: a can() test is only defined for perl 5.10.0!
2210
2211           plugin(Name::Of::Plugin)
2212               This is a function call that returns 1 if the plugin named
2213               "Name::Of::Plugin" is loaded, or "undef" otherwise.
2214
2215           has(Name::Of::Package::function_name)
2216               This is a function call that returns 1 if the perl package
2217               named "Name::Of::Package" includes a function called
2218               "function_name", or "undef" otherwise.  Note that packages can
2219               be SpamAssassin plugins or built-in classes, there's no
2220               difference in this respect.  Internally this invokes
2221               UNIVERSAL::can.
2222
2223           can(Name::Of::Package::function_name)
2224               This is a function call that returns 1 if the perl package
2225               named "Name::Of::Package" includes a function called
2226               "function_name" and that function returns a true value when
2227               called with no arguments, otherwise "undef" is returned.
2228
2229               This is similar to "has", except that it also calls the named
2230               function, testing its return value (unlike the perl function
2231               UNIVERSAL::can).  This makes it possible for a 'feature'
2232               function to determine its result value at run time.
2233
2234           If the end of a configuration file is reached while still inside an
2235           "if" scope, a warning will be issued, but parsing will restart on
2236           the next file.
2237
2238           For example:
2239
2240                   if (version > 3.000000)
2241                     header MY_FOO ...
2242                   endif
2243
2244                   loadplugin MyPlugin plugintest.pm
2245
2246                   if plugin (MyPlugin)
2247                     header MY_PLUGIN_FOO  eval:check_for_foo()
2248                     score  MY_PLUGIN_FOO  0.1
2249                   endif
2250
2251       ifplugin PluginModuleName
2252           An alias for "if plugin(PluginModuleName)".
2253
2254       else
2255           Used to support conditional interpretation of the configuration
2256           file. Lines between this and a corresponding "endif" line will be
2257           ignored unless the conditional expression evaluates as false (in
2258           the perl sense; that is, not defined, or 0, or the empty
2259           string).
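
               A small sketch of an if/else/endif block.  The rule names and
               patterns are invented, and 3.004000 is the internal form of
               version 3.4.0:

               ```
               if (version >= 3.004000)
                 body      LOCAL_PHRASE_NEW  /\bexample phrase\b/i
                 describe  LOCAL_PHRASE_NEW  Phrase check, 3.4.0 or later
               else
                 body      LOCAL_PHRASE_OLD  /\bexample phrase\b/i
                 describe  LOCAL_PHRASE_OLD  Phrase check, engines older than 3.4.0
               endif
               ```

               Only one of the two branches is read by any given engine, so the
               rule in the other branch is never defined.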
2260
2261       require_version n.nnnnnn
2262           Indicates that the entire file, from this line on, requires a
2263           certain version of SpamAssassin to run.  If a different (older or
2264           newer) version of SpamAssassin tries to read the configuration from
2265           this file, it will output a warning instead, and ignore it.
2266
2267           Note: The version used is in the internal SpamAssassin version
2268           format which is "x.yyyzzz", where x is major version, y is minor
2269           version, and z is maintenance version.  So 3.0.0 is 3.000000, and
2270           3.4.80 is 3.004080.
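
               For example, a file intended for SpamAssassin 3.4.0 would start
               with the line below (the version is chosen only for illustration):

               ```
               require_version 3.004000
               ```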
2271

TEMPLATE TAGS

2273       The following "tags" can be used as placeholders in certain options.
2274       They will be replaced by the corresponding value when they are used.
2275
2276       Some tags can take an argument (in parentheses). The argument is
2277       optional, and the default is shown below.
2278
2279        _YESNO_           "Yes" for spam, "No" for nonspam (=ham)
2280        _YESNO(spam_str,ham_str)_  returns the first argument ("Yes" if missing)
2281                          for spam, and the second argument ("No" if missing) for ham
2282        _YESNOCAPS_       "YES" for spam, "NO" for nonspam (=ham)
2283        _YESNOCAPS(spam_str,ham_str)_  same as _YESNO(...)_, but uppercased
2284        _SCORE(PAD)_      message score, if PAD is included and is either spaces or
2285                          zeroes, then pad scores with that many spaces or zeroes
2286                          (default, none)  ie: _SCORE(0)_ makes 2.4 become 02.4,
2287                          _SCORE(00)_ is 002.4.  12.3 would be 12.3 and 012.3
2288                          respectively.
2289        _REQD_            message threshold
2290        _VERSION_         version (eg. 3.0.0 or 3.1.0-r26142-foo1)
2291        _SUBVERSION_      sub-version/code revision date (eg. 2004-01-10)
2292        _RULESVERSION_    comma-separated list of rules versions, retrieved from
2293                          an '# UPDATE version' comment in rules files; if there is
2294                          more than one set of rules (update channels) the order
2295                          is unspecified (currently sorted by names of files)
2296        _HOSTNAME_        hostname of the machine the mail was processed on
2297        _REMOTEHOSTNAME_  hostname of the machine the mail was sent from, only
2298                          available with spamd
2299        _REMOTEHOSTADDR_  ip address of the machine the mail was sent from, only
2300                          available with spamd
2301        _BAYES_           bayes score
2302        _TOKENSUMMARY_    number of new, neutral, spammy, and hammy tokens found
2303        _BAYESTC_         number of new tokens found
2304        _BAYESTCLEARNED_  number of seen tokens found
2305        _BAYESTCSPAMMY_   number of spammy tokens found
2306        _BAYESTCHAMMY_    number of hammy tokens found
2307        _HAMMYTOKENS(N)_  the N most significant hammy tokens (default, 5)
2308        _SPAMMYTOKENS(N)_ the N most significant spammy tokens (default, 5)
2309        _DATE_            rfc-2822 date of scan
2310        _STARS(*)_        one "*" (use any character) for each full score point
2311                          (note: limited to 50 'stars')
2312        _SENDERDOMAIN_    a domain name of the envelope sender address, lowercased
2313        _AUTHORDOMAIN_    a domain name of the author address (the From header
2314                          field), lowercased;  note that RFC 5322 allows a mail
2315                          message to have multiple authors - currently only the
2316                          domain name of the first email address is returned
2317        _RELAYSTRUSTED_   relays used and deemed to be trusted (see the
2318                          'X-Spam-Relays-Trusted' pseudo-header)
2319        _RELAYSUNTRUSTED_ relays used that can not be trusted (see the
2320                          'X-Spam-Relays-Untrusted' pseudo-header)
2321        _RELAYSINTERNAL_  relays used and deemed to be internal (see the
2322                          'X-Spam-Relays-Internal' pseudo-header)
2323        _RELAYSEXTERNAL_  relays used and deemed to be external (see the
2324                          'X-Spam-Relays-External' pseudo-header)
2325        _LASTEXTERNALIP_  IP address of client in the external-to-internal
2326                          SMTP handover
2327        _LASTEXTERNALRDNS_ reverse-DNS of client in the external-to-internal
2328                          SMTP handover
2329        _LASTEXTERNALHELO_ HELO string used by client in the external-to-internal
2330                          SMTP handover
2331        _AUTOLEARN_       autolearn status ("ham", "no", "spam", "disabled",
2332                          "failed", "unavailable")
2333        _AUTOLEARNSCORE_  portion of message score used by autolearn
2334        _TESTS(,)_        tests hit separated by "," (or other separator)
2335        _TESTSSCORES(,)_  as above, except with scores appended (e.g. AWL=-3.0,...)
2336        _SUBTESTS(,)_     subtests (start with "__") hit separated by ","
2337                          (or other separator)
2338        _SUBTESTSCOLLAPSED(,)_ subtests (start with "__") hit separated by ","
2339                          (or other separator) with duplicated rules collapsed
2340        _DCCB_            DCC's "Brand"
2341        _DCCR_            DCC's results
2342        _PYZOR_           Pyzor results
2343        _RBL_             full results for positive RBL queries in DNS URI format
2344        _LANGUAGES_       possible languages of mail
2345        _PREVIEW_         content preview
2346        _REPORT_          terse report of tests hit (for header reports)
2347        _SUBJPREFIX_      subject prefix based on rules, to be prepended to Subject
2348                          header by SpamAssassin caller
2349        _SUMMARY_         summary of tests hit for standard report (for body reports)
2350        _CONTACTADDRESS_  contents of the 'report_contact' setting
2351        _HEADER(NAME)_    includes the value of a message header; the value is the
2352                          same as that seen by header rules (see elsewhere in this doc)
2353        _TIMING_          timing breakdown report
2354        _ADDEDHEADERHAM_  resulting header fields as requested by add_header for ham
2355        _ADDEDHEADERSPAM_ resulting header fields as requested by add_header for spam
2356        _ADDEDHEADER_     same as ADDEDHEADERHAM for ham or ADDEDHEADERSPAM for spam
2357
2358       If a tag reference uses the name of a tag which is not in this list or
2359       defined by a loaded plugin, the reference will be left intact and not
2360       replaced by any value.
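
           As a sketch of how these tags are typically used, the following
           "add_header" lines reference several tags from the list above.  The
           header names ("Level", "Tests", "Autolearn", "OrigSubject") are
           illustrative choices only:

             add_header all Level _STARS(*)_
             add_header all Tests _TESTSSCORES(,)_
             add_header all Autolearn _AUTOLEARN_ (score _AUTOLEARNSCORE_)
             add_header spam OrigSubject _HEADER(Subject)_

           Each header is emitted with an "X-Spam-" prefix, so the first line
           would produce an "X-Spam-Level" header holding one "*" for each full
           score point.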
2361
2362       Additional, plugin specific, template tags can be found in the
2363       documentation for the following plugins:
2364
2365        Mail::SpamAssassin::Plugin::ASN
2366        Mail::SpamAssassin::Plugin::AWL
2367        Mail::SpamAssassin::Plugin::TxRep
2368
2369       The "HAMMYTOKENS" and "SPAMMYTOKENS" tags have an optional second
2370       argument which specifies a format.  See the HAMMYTOKENS/SPAMMYTOKENS
2371       TAG FORMAT section, below, for details.
2372
2373   HAMMYTOKENS/SPAMMYTOKENS TAG FORMAT
2374       The "HAMMYTOKENS" and "SPAMMYTOKENS" tags have an optional second
2375       argument which specifies a format: "_SPAMMYTOKENS(N,FMT)_",
2376       "_HAMMYTOKENS(N,FMT)_".  The following formats are available:
2377
2378       short
2379           Only the tokens themselves are listed.  For example, preference
2380           file entry:
2381
2382           "add_header all Spammy _SPAMMYTOKENS(2,short)_"
2383
2384           Results in message header:
2385
2386           "X-Spam-Spammy: remove.php, UD:jpg"
2387
2388           Indicating that the top two spammy tokens found are "remove.php"
2389           and "UD:jpg".  (The token itself follows the last colon; the text
2390           before the colon indicates something about the token.  "UD" means
2391           the token looks like it might be part of a domain name.)
2392
2393       compact
2394           The token probability, an abbreviated declassification distance
2395           (see example), and the token are listed.  For example, preference
2396           file entry:
2397
2398           "add_header all Spammy _SPAMMYTOKENS(2,compact)_"
2399
2400           Results in message header:
2401
2402           "X-Spam-Spammy: 0.989-6--remove.php, 0.988-+--UD:jpg"
2403
2404           Indicating that the probabilities of the top two tokens are 0.989
2405           and 0.988, respectively.  The first token has a declassification
2406           distance of 6, meaning that if the token had appeared in at least 6
2407           more ham messages it would not be considered spammy.  The "+" for
2408           the second token indicates a declassification distance greater than
2409           9.
2410
2411       long
2412           Probability, declassification distance, number of times seen in a
2413           ham message, number of times seen in a spam message, age and the
2414           token are listed.
2415
2416           For example, preference file entry:
2417
2418           "add_header all Spammy _SPAMMYTOKENS(2,long)_"
2419
2420           Results in message header:
2421
2422           "X-Spam-Spammy: 0.989-6--0h-4s--4d--remove.php,
2423           0.988-33--2h-25s--1d--UD:jpg"
2424
2425           In addition to the information provided by the compact option, the
2426           long option shows that the first token appeared in zero ham
2427           messages and four spam messages, and that it was last seen four
2428           days ago.  The second token appeared in two ham messages, 25 spam
2429           messages and was last seen one day ago.  (Unlike the "compact"
2430           option, the long option shows declassification distances that are
2431           greater than 9.)
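
           The examples above all use "SPAMMYTOKENS"; "HAMMYTOKENS" accepts the
           same optional format argument.  As a minimal sketch (the header name
           "Hammy" and the token counts are arbitrary choices for
           illustration), both tags can be reported side by side:

             add_header all Spammy _SPAMMYTOKENS(3,compact)_
             add_header all Hammy _HAMMYTOKENS(3,short)_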
2432

LOCALI[SZ]ATION

2434       A line starting with the text "lang xx" will only be interpreted if the
2435       user is in that locale, allowing test descriptions and templates to be
2436       set for that language.
2437
2438       The locale string should specify either both the language and country,
2439       e.g.  "lang pt_BR", or just the language, e.g. "lang de".
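
           For example, the following lines give one description keyed on the
           language only and one keyed on both language and country.  The rule
           name is reused from the SYNOPSIS, and the translated texts are
           illustrative, not shipped translations:

             lang de describe FROM_HAS_MIXED_NUMS From: Zahlen und Buchstaben gemischt
             lang pt_BR describe FROM_HAS_MIXED_NUMS From: mistura letras e algarismos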
2440