1 Mail::SpamAssassin::Conf(3)    User Contributed Perl Documentation    Mail::SpamAssassin::Conf(3)
2
3
4
6 Mail::SpamAssassin::Conf - SpamAssassin configuration file
7
9 # a comment
10
11 rewrite_header Subject *****SPAM*****
12
13 full PARA_A_2_C_OF_1618 /Paragraph .a.{0,10}2.{0,10}C. of S. 1618/i
14 describe PARA_A_2_C_OF_1618 Claims compliance with senate bill 1618
15
16 header FROM_HAS_MIXED_NUMS From =~ /\d+[a-z]+\d+\S*@/i
17 describe FROM_HAS_MIXED_NUMS From: contains numbers mixed in with letters
18
19 score A_HREF_TO_REMOVE 2.0
20
21 lang es describe FROM_FORGED_HOTMAIL Forzado From: simula ser de hotmail.com
22
23 lang pt_BR report O programa detetor de Spam ZOE [...]
24
26 SpamAssassin is configured using traditional UNIX-style configuration
27 files, loaded from the "/usr/share/spamassassin" and "/etc/mail/spamas‐
28 sassin" directories.
29
30 The following web page lists the most important configuration settings
31 used to configure SpamAssassin; novices are encouraged to read it
32 first:
33
34 http://wiki.apache.org/spamassassin/ImportantInitialConfigItems
35
37 The "#" character starts a comment, which continues until end of line.
38 NOTE: if the "#" character is to be used as part of a rule or configu‐
39 ration option, it must be escaped with a backslash. i.e.: "\#"
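The comment rule above can be sketched in Python as follows. This is only an illustration of the stated rule, not SpamAssassin's own parser; the function name strip_comment is hypothetical, and whether the parser later rewrites "\#" to "#" is not modeled here.

```python
import re

# First "#" that is NOT preceded by a backslash, plus everything after it.
_COMMENT = re.compile(r"(?<!\\)#.*$")

def strip_comment(line: str) -> str:
    # Remove the comment part of a configuration line.
    # An escaped hash (backslash followed by "#") is left untouched.
    return _COMMENT.sub("", line).rstrip()
```

For instance, a line holding only a comment reduces to an empty string, while a rule whose pattern contains an escaped hash is kept whole.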
40
41 Whitespace in the files is not significant, but please note that start‐
42 ing a line with whitespace is deprecated, as we reserve its use for
43 multi-line rule definitions, at some point in the future.
44
45 Currently, each rule or configuration setting must fit on one line;
46 multi-line settings are not supported yet.
47
48 File and directory paths can use "~" to refer to the user's home
49 directory, but no other shell-style path extensions such as globbing or
50 "~user/" are supported.
51
52 Where appropriate below, default values are listed in parentheses.
53
55 The following options can be used in both site-wide ("local.cf") and
56 user-specific ("user_prefs") configuration files to customize how Spa‐
57 mAssassin handles incoming email messages.
58
59 SCORING OPTIONS
60
61 required_score n.nn (default: 5)
62 Set the score required before a mail is considered spam. "n.nn"
63 can be an integer or a real number. 5.0 is the default setting,
64 and is quite aggressive; it would be suitable for a single-user
65 setup, but if you're an ISP installing SpamAssassin, you should
66 probably set the default to be more conservative, like 8.0 or 10.0.
67 It is not recommended to automatically delete or discard messages
68 marked as spam, as your users will complain, but if you choose to
69 do so, only delete messages with an exceptionally high score such
70 as 15.0 or higher. This option was previously known as
71 "required_hits" and that name is still accepted, but is deprecated.
72
73 score SYMBOLIC_TEST_NAME n.nn [ n.nn n.nn n.nn ]
74 Assign scores (the number of points for a hit) to a given test.
75 Scores can be positive or negative real numbers or integers. "SYM‐
76 BOLIC_TEST_NAME" is the symbolic name used by SpamAssassin for that
77 test; for example, 'FROM_ENDS_IN_NUMS'.
78
79 If only one valid score is listed, then that score is always used
80 for a test.
81
82 If four valid scores are listed, then the score that is used
83 depends on how SpamAssassin is being used. The first score is used
84 when both Bayes and network tests are disabled (score set 0). The
85 second score is used when Bayes is disabled, but network tests are
86 enabled (score set 1). The third score is used when Bayes is
87 enabled and network tests are disabled (score set 2). The fourth
88 score is used when Bayes is enabled and network tests are enabled
89 (score set 3).
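The mapping from the Bayes and network settings to a score set can be written as a small Python sketch. The function names are hypothetical and the code only mirrors the four cases described above; it is not taken from SpamAssassin itself.

```python
def score_set_index(bayes_enabled: bool, network_enabled: bool) -> int:
    # Score set 0: neither; 1: network only; 2: Bayes only; 3: both.
    return (2 if bayes_enabled else 0) + (1 if network_enabled else 0)

def effective_score(scores, bayes_enabled, network_enabled):
    # A single listed score is used in every score set.
    if len(scores) == 1:
        return scores[0]
    # Four listed scores are indexed by the active score set.
    if len(scores) == 4:
        return scores[score_set_index(bayes_enabled, network_enabled)]
    raise ValueError("a score line lists either one or four scores")
```

With the scores 1.0 1.5 2.0 2.5, a setup with Bayes enabled and network tests disabled would use the third value, 2.0.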
90
91 Setting a rule's score to 0 will disable that rule from running.
92
93 If any of the score values are surrounded by parentheses '()', then
94 all of the scores in the line are considered to be relative to the
95 already set score. For example, '(3)' means increase the score for this
96 rule by 3 points in all score sets. '(3) (0) (3) (0)' means
97 increase the score for this rule by 3 in score sets 0 and 2 only.
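As a rough Python sketch of the relative-score rule: if any value on the line is parenthesized, every value is added to the score already set; otherwise the values replace it. The function name apply_score_line and the list-of-four representation are assumptions made for illustration.

```python
def apply_score_line(current, tokens):
    # current: the four scores already set (score sets 0..3).
    # tokens: the one or four value strings from a "score" line.
    if len(tokens) == 1:
        tokens = tokens * 4          # one value applies to all score sets
    if len(tokens) != 4:
        raise ValueError("a score line lists either one or four scores")
    # One parenthesized value makes the whole line relative.
    relative = any(t.startswith("(") and t.endswith(")") for t in tokens)
    values = [float(t.strip("()")) for t in tokens]
    if relative:
        return [c + v for c, v in zip(current, values)]
    return values
```

Applied to current scores of 1.0 2.0 3.0 4.0, the line '(3) (0) (3) (0)' yields 4.0 2.0 6.0 4.0.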
98
99 If no score is given for a test by the end of the configuration, a
100 default score is assigned: a score of 1.0 is used for all tests,
101 except those whose names begin with 'T_' (this is used to indicate a
102 rule in testing) which receive 0.01.
103
104 Note that test names which begin with '__' are indirect rules used
105 to compose meta-match rules and can also act as prerequisites to
106 other rules. They are not scored or listed in the 'tests hit'
107 reports, but assigning a score of 0 to an indirect rule will dis‐
108 able it from running.
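The default-score rules stated above can be summarized in a short Python sketch. It follows only what this section says; the function name is hypothetical, and returning None for indirect rules is simply this sketch's way of saying "not scored".

```python
def default_score(rule_name: str):
    # Indirect rules ('__' prefix) are not scored at all.
    if rule_name.startswith("__"):
        return None
    # Rules in testing ('T_' prefix) receive a token score.
    if rule_name.startswith("T_"):
        return 0.01
    # Every other rule without an explicit score gets 1.0.
    return 1.0
```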
109
110 WHITELIST AND BLACKLIST OPTIONS
111
112 whitelist_from add@ress.com
113 Used to whitelist sender addresses which send mail that is
114 often tagged (incorrectly) as spam.
115
116 Use of this setting is not recommended, since it blindly trusts
117 the message, which is routinely and easily forged by spammers
118 and phish senders. The recommended solution is to instead use
119 "whitelist_auth" or other authenticated whitelisting methods,
120 or "whitelist_from_rcvd".
121
122 Whitelist and blacklist addresses are now file-glob-style pat‐
123 terns, so "friend@somewhere.com", "*@isp.com", or
124 "*.domain.net" will all work. Specifically, "*" and "?" are
125 allowed, but all other metacharacters are not. Regular expres‐
126 sions are not used for security reasons.
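To illustrate the file-glob matching, here is a minimal Python sketch that treats only "*" and "?" as wildcards and every other character literally. The function names are hypothetical, and the case-insensitive comparison is an assumption of this sketch rather than something stated above.

```python
import re

def glob_to_regex(pattern: str):
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")           # any run of characters
        elif ch == "?":
            parts.append(".")            # exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is literal
    # Assumption: addresses are compared case-insensitively.
    return re.compile("".join(parts), re.IGNORECASE)

def address_matches(pattern: str, address: str) -> bool:
    # The whole address must match, not just a prefix.
    return glob_to_regex(pattern).fullmatch(address) is not None
```

Because other metacharacters are escaped, a pattern such as "(a|b)@example.com" only matches that literal string.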
127
128 Multiple addresses per line, separated by spaces, are OK.
129 Multiple "whitelist_from" lines are also OK.
130
131 The headers checked for whitelist addresses are as follows: if
132 "Resent-From" is set, use that; otherwise check all addresses
133 taken from the following set of headers:
134
135 Envelope-Sender
136 Resent-Sender
137 X-Envelope-From
138 From
139
140 In addition, the "envelope sender" data, taken from the SMTP
141 envelope data where this is available, is looked up. See
142 "envelope_sender_header".
143
144 e.g.
145
146 whitelist_from joe@example.com fred@example.com
147 whitelist_from *@example.com
148
149 unwhitelist_from add@ress.com
150 Used to override a default whitelist_from entry, so for example
151 a distribution whitelist_from can be overridden in a local.cf
152 file, or an individual user can override a whitelist_from entry
153 in their own "user_prefs" file. The specified email address
154 has to match exactly the address previously used in a
155 whitelist_from line.
156
157 e.g.
158
159 unwhitelist_from joe@example.com fred@example.com
160 unwhitelist_from *@example.com
161
162 whitelist_from_rcvd addr@lists.sourceforge.net sourceforge.net
163 Use this to supplement the whitelist_from addresses with a
164 check against the Received headers. The first parameter is the
165 address to whitelist, and the second is a string to match the
166 relay's rDNS.
167
168 This string is matched against the reverse DNS lookup used dur‐
169 ing the handover from the internet to your internal network's
170 mail exchangers. It can either be the full hostname, or the
171 domain component of that hostname. In other words, if the host
172 that connected to your MX had an IP address that mapped to
173 'sendinghost.spamassassin.org', you should specify "send‐
174 inghost.spamassassin.org" or just "spamassassin.org" here.
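The hostname-or-domain comparison can be pictured with the following Python sketch. The function name is hypothetical, and the lowercasing and trailing-dot handling are assumptions added for robustness; the section above only states that a full hostname or its domain component will match.

```python
def rdns_matches(relay_rdns: str, spec: str) -> bool:
    # Normalize case and any trailing dot (assumption of this sketch).
    host = relay_rdns.lower().rstrip(".")
    wanted = spec.lower().rstrip(".")
    # Match the full hostname, or a domain component on a label boundary.
    return host == wanted or host.endswith("." + wanted)
```

Matching on a label boundary means "notspamassassin.org" is not accepted for the string "spamassassin.org".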
175
176 Note that this requires that "internal_networks" be correct.
177 For simple cases, it will be, but for a complex network you may
178 get better results by setting that parameter.
179
180 It also requires that your mail exchangers be configured to
181 perform DNS reverse lookups on the connecting host's IP
182 address, and to record the result in the generated Received:
183 header.
184
185 e.g.
186
187 whitelist_from_rcvd joe@example.com example.com
188 whitelist_from_rcvd *@axkit.org sergeant.org
189
190 def_whitelist_from_rcvd addr@lists.sourceforge.net sourceforge.net
191 Same as "whitelist_from_rcvd", but used for the default
192 whitelist entries in the SpamAssassin distribution. The
193 whitelist score is lower, because these are often targets for
194 spammer spoofing.
195
196 whitelist_allows_relays add@ress.com
197 Specify addresses which are in "whitelist_from_rcvd" that some‐
198 times send through a mail relay other than the listed ones. By
199 default mail with a From address that is in
200 "whitelist_from_rcvd" that does not match the relay will
201 trigger a forgery rule. Including the address in
202 "whitelist_allows_relays" prevents that.
203
204 Whitelist and blacklist addresses are now file-glob-style pat‐
205 terns, so "friend@somewhere.com", "*@isp.com", or
206 "*.domain.net" will all work. Specifically, "*" and "?" are
207 allowed, but all other metacharacters are not. Regular expres‐
208 sions are not used for security reasons.
209
210 Multiple addresses per line, separated by spaces, are OK.
211 Multiple "whitelist_allows_relays" lines are also OK.
212
213 The specified email address does not have to match exactly the
214 address previously used in a whitelist_from_rcvd line as it is
215 compared to the address in the header.
216
217 e.g.
218
219 whitelist_allows_relays joe@example.com fred@example.com
220 whitelist_allows_relays *@example.com
221
222 unwhitelist_from_rcvd add@ress.com
223 Used to override a default whitelist_from_rcvd entry, so for
224 example a distribution whitelist_from_rcvd can be overridden in
225 a local.cf file, or an individual user can override a
226 whitelist_from_rcvd entry in their own "user_prefs" file.
227
228 The specified email address has to match exactly the address
229 previously used in a whitelist_from_rcvd line.
230
231 e.g.
232
233 unwhitelist_from_rcvd joe@example.com fred@example.com
234 unwhitelist_from_rcvd *@axkit.org
235
236 blacklist_from add@ress.com
237 Used to specify addresses which send mail that is often tagged
238 (incorrectly) as non-spam, but which the user doesn't want.
239 Same format as "whitelist_from".
240
241 unblacklist_from add@ress.com
242 Used to override a default blacklist_from entry, so for example
243 a distribution blacklist_from can be overridden in a local.cf
244 file, or an individual user can override a blacklist_from entry
245 in their own "user_prefs" file. The specified email address has
246 to match exactly the address previously used in a black‐
247 list_from line.
248
249 e.g.
250
251 unblacklist_from joe@example.com fred@example.com
252 unblacklist_from *@spammer.com
253
254 whitelist_to add@ress.com
255 If the given address appears as a recipient in the message
256 headers (Resent-To, To, Cc, obvious envelope recipient, etc.)
257 the mail will be whitelisted. Useful if you're deploying Spa‐
258 mAssassin system-wide, and don't want some users to have their
259 mail filtered. Same format as "whitelist_from".
260
261 There are three levels of To-whitelisting, "whitelist_to",
262 "more_spam_to" and "all_spam_to". Users in the first level may
263 still get some spammish mails blocked, but users in
264 "all_spam_to" should never get mail blocked.
265
266 The headers checked for whitelist addresses are as follows: if
267 "Resent-To" or "Resent-Cc" are set, use those; otherwise check
268 all addresses taken from the following set of headers:
269
270 To
271 Cc
272 Apparently-To
273 Delivered-To
274 Envelope-Recipients
275 Apparently-Resent-To
276 X-Envelope-To
277 Envelope-To
278 X-Delivered-To
279 X-Original-To
280 X-Rcpt-To
281 X-Real-To
282
283 more_spam_to add@ress.com
284 See above.
285
286 all_spam_to add@ress.com
287 See above.
288
289 blacklist_to add@ress.com
290 If the given address appears as a recipient in the message
291 headers (Resent-To, To, Cc, obvious envelope recipient, etc.)
292 the mail will be blacklisted. Same format as "blacklist_from".
293
294 whitelist_auth add@ress.com
295 Used to specify addresses which send mail that is often tagged
296 (incorrectly) as spam. This is different from "whitelist_from"
297 and "whitelist_from_rcvd" in that it first verifies that the
298 message was sent by an authorized sender for the address,
299 before whitelisting.
300
301 Authorization is performed using one of the installed
302 sender-authorization schemes: SPF (using
303 "Mail::SpamAssassin::Plugin::SPF"), Domain Keys (using
304 "Mail::SpamAssassin::Plugin::DomainKeys"), or DKIM (using
305 "Mail::SpamAssassin::Plugin::DKIM"). Note that those plugins must be
306 active, and working, for this to operate.
307
308 Using "whitelist_auth" is roughly equivalent to specifying
309 duplicate "whitelist_from_spf", "whitelist_from_dk", and
310 "whitelist_from_dkim" lines for each of the addresses speci‐
311 fied.
312
313 e.g.
314
315 whitelist_auth joe@example.com fred@example.com
316 whitelist_auth *@example.com
317
318 def_whitelist_auth add@ress.com
319 Same as "whitelist_auth", but used for the default whitelist
320 entries in the SpamAssassin distribution. The whitelist score
321 is lower, because these are often targets for spammer spoofing.
322
323 unwhitelist_auth add@ress.com
324 Used to override a "whitelist_auth" entry. The specified email
325 address has to match exactly the address previously used in a
326 "whitelist_auth" line.
327
328 e.g.
329
330 unwhitelist_auth joe@example.com fred@example.com
331 unwhitelist_auth *@example.com
332
333 BASIC MESSAGE TAGGING OPTIONS
334
335 rewrite_header { subject ⎪ from ⎪ to } STRING
336 By default, suspected spam messages will not have the "Sub‐
337 ject", "From" or "To" lines tagged to indicate spam. By setting
338 this option, the header will be tagged with "STRING" to indi‐
339 cate that a message is spam. For the From or To headers, this
340 will take the form of an RFC 2822 comment following the address
341 in parentheses. For the Subject header, this will be prepended
342 to the original subject. Note that you should only use the
343 _REQD_ and _SCORE_ tags when rewriting the Subject header if
344 "report_safe" is 0. Otherwise, you may not be able to remove
345 the SpamAssassin markup via the normal methods. More informa‐
346 tion about tags is explained below in the TEMPLATE TAGS sec‐
347 tion.
348
349 Parentheses are not permitted in STRING if rewriting the From
350 or To headers. (They will be converted to square brackets.)
351
352 If "rewrite_header subject" is used, but the message being
353 rewritten does not already contain a "Subject" header, one will
354 be created.
355
356 A null value for "STRING" will remove any existing rewrite for
357 the specified header.
358
359 add_header { spam ⎪ ham ⎪ all } header_name string
360 Customized headers can be added to the specified type of mes‐
361 sages (spam, ham, or "all" to add to either). All headers
362 begin with "X-Spam-" (so a "header_name" Foo will generate a
363 header called X-Spam-Foo). header_name is restricted to the
364 character set [A-Za-z0-9_-].
365
366 "string" can contain tags as explained below in the TEMPLATE
367 TAGS section. You can also use "\n" and "\t" in the header to
368 add newlines and tabulators as desired. A backslash has to be
369 written as \\, any other escaped chars will be silently
370 removed.
371
372 All headers will be folded if fold_headers is set to 1. Note:
373 Manually adding newlines via "\n" disables any further automatic
374 wrapping (i.e., long header lines are possible). The lines
375 will still be properly folded (marked as continuing) though.
376
377 You can customize existing headers with add_header (only the
378 specified subset of messages will be changed).
379
380 See also "clear_headers" for removing headers.
381
382 Here are some examples (these are the defaults, note that
383 Checker-Version can not be changed or removed):
384
385 add_header spam Flag _YESNOCAPS_
386 add_header all Status _YESNO_, score=_SCORE_ required=_REQD_ tests=_TESTS_ autolearn=_AUTOLEARN_ version=_VERSION_
387 add_header all Level _STARS(*)_
388 add_header all Checker-Version SpamAssassin _VERSION_ (_SUBVERSION_) on _HOSTNAME_
389
390 remove_header { spam ⎪ ham ⎪ all } header_name
391 Headers can be removed from the specified type of messages
392 (spam, ham, or "all" to remove from either). All headers begin
393 with "X-Spam-" (so "header_name" will be appended to
394 "X-Spam-").
395
396 See also "clear_headers" for removing all the headers at once.
397
398 Note that X-Spam-Checker-Version is not removable because the
399 version information is needed by mail administrators and devel‐
400 opers to debug problems. Without at least one header, it might
401 not even be possible to determine that SpamAssassin is running.
402
403 clear_headers
404 Clear the list of headers to be added to messages. You may use
405 this before any add_header options to prevent the default head‐
406 ers from being added to the message.
407
408 Note that X-Spam-Checker-Version is not removable because the
409 version information is needed by mail administrators and devel‐
410 opers to debug problems. Without at least one header, it might
411 not even be possible to determine that SpamAssassin is running.
412
413 report_safe ( 0 ⎪ 1 ⎪ 2 ) (default: 1)
414 If this option is set to 1 and an incoming message is tagged as
415 spam, instead of modifying the original message, SpamAssassin
416 will create a new report message and attach the original mes‐
417 sage as a message/rfc822 MIME part (ensuring the original mes‐
418 sage is completely preserved, not easily opened, and easier to
419 recover).
420
421 If this option is set to 2, then original messages will be
422 attached with a content type of text/plain instead of mes‐
423 sage/rfc822. This setting may be required for safety reasons
424 on certain broken mail clients that automatically load attach‐
425 ments without any action by the user. This setting may also
426 make it somewhat more difficult to extract or view the original
427 message.
428
429 If this option is set to 0, incoming spam is only modified by
430 adding some "X-Spam-" headers and no changes will be made to
431 the body. In addition, a header named X-Spam-Report will be
432 added to spam. You can use the remove_header option to remove
433 that header after setting report_safe to 0.
434
435 See report_safe_copy_headers if you want to copy headers from
436 the original mail into tagged messages.
437
438 LANGUAGE OPTIONS
439
440 ok_locales xx [ yy zz ... ] (default: all)
441 This option is used to specify which locales are considered OK
442 for incoming mail. Mail using the character sets that are
443 allowed by this option will not be marked as possibly being
444 spam in a foreign language.
445
446 If you receive lots of spam in foreign languages, and never get
447 any non-spam in these languages, this may help. Note that all
448 ISO-8859-* character sets, and Windows code page character
449 sets, are always permitted by default.
450
451 Set this to "all" to allow all character sets. This is the
452 default.
453
454 The rules "CHARSET_FARAWAY", "CHARSET_FARAWAY_BODY", and
455 "CHARSET_FARAWAY_HEADERS" are triggered based on how this is
456 set.
457
458 Examples:
459
460 ok_locales all (allow all locales)
461 ok_locales en (only allow English)
462 ok_locales en ja zh (allow English, Japanese, and Chinese)
463
464 Note: if there are multiple ok_locales lines, only the last one
465 is used.
466
467 Select the locales to allow from the list below:
468
469 en - Western character sets in general
470 ja - Japanese character sets
471 ko - Korean character sets
472 ru - Cyrillic character sets
473 th - Thai character sets
474 zh - Chinese (both simplified and traditional) character sets
475 normalize_charset ( 0 ⎪ 1) (default: 0)
476 Whether to detect character sets and normalize message content
477 to Unicode. Requires the Encode::Detect module, HTML::Parser
478 version 3.46 or later, and Perl 5.8.5 or later.
479
480 NETWORK TEST OPTIONS
481
482 trusted_networks ip.add.re.ss[/mask] ... (default: none)
483 What networks or hosts are 'trusted' in your setup. Trusted in
484 this case means that relay hosts on these networks are consid‐
485 ered to not be potentially operated by spammers, open relays,
486 or open proxies. A trusted host could conceivably relay spam,
487 but will not originate it, and will not forge header data. DNS
488 blacklist checks will never query for hosts on these networks.
489
490 See "http://wiki.apache.org/spamassassin/TrustPath" for more
491 information.
492
493 MXes for your domain(s) and internal relays should also be
494 specified using the "internal_networks" setting. When there are
495 'trusted' hosts that are not MXes or internal relays for your
496 domain(s) they should only be specified in "trusted_networks".
497
498 If a "/mask" is specified, it's considered a CIDR-style 'netmask',
499 specified in bits. If it is not specified, but fewer
500 than 4 octets are specified with a trailing dot, that's considered
501 a mask to allow all addresses in the remaining octets. If
502 a mask is not specified, and there is no trailing dot, then
503 just the single IP address specified is used, as if the mask
504 was "/32".
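The three mask rules can be sketched in Python with the standard ipaddress module. This is an IPv4-only illustration under the rules stated above, not SpamAssassin's parser; parse_network is a hypothetical name, and padding missing octets with zeros is an assumption of the sketch.

```python
import ipaddress

def parse_network(spec: str) -> ipaddress.IPv4Network:
    addr, slash, mask = spec.partition("/")
    trailing_dot = addr.endswith(".")
    octets = [o for o in addr.split(".") if o]
    if not 1 <= len(octets) <= 4:
        raise ValueError("expected one to four octets: " + spec)
    if slash:
        bits = int(mask)            # explicit CIDR-style mask, in bits
    elif trailing_dot and len(octets) < 4:
        bits = 8 * len(octets)      # "10.0.1." covers all of 10.0.1.*
    else:
        bits = 32                   # a single host
    # Assumption: octets that were left out are filled with zeros.
    padded = octets + ["0"] * (4 - len(octets))
    return ipaddress.ip_network(".".join(padded) + "/" + str(bits), strict=False)
```

Under these rules "192.168/16" becomes 192.168.0.0/16, "10.0.1." becomes 10.0.1.0/24, and "212.17.35.15" becomes 212.17.35.15/32.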
505
506 If a network or host address is prefaced by a "!" the network
507 or host will be excluded (or included) in a first listed match
508 fashion.
509
510 Note: 127/8 is always included in trusted_networks, regardless
511 of your config.
512
513 Examples:
514
515 trusted_networks 192.168/16 # all in 192.168.*.*
516 trusted_networks 212.17.35.15 # just that host
517 trusted_networks !10.0.1.5 10.0.1/24 # all in 10.0.1.* but not 10.0.1.5
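The "first listed match" behaviour of "!" entries can be pictured with this Python sketch. For brevity it expects entries already written as full addresses or CIDR networks; the function name is hypothetical, and the always-included 127/8 range is not modeled.

```python
import ipaddress

def is_trusted(ip: str, entries) -> bool:
    # entries: strings in configuration order; a leading "!" excludes.
    address = ipaddress.ip_address(ip)
    for entry in entries:
        excluded = entry.startswith("!")
        network = ipaddress.ip_network(entry.lstrip("!"), strict=False)
        if address in network:
            # The first entry that contains the address decides.
            return not excluded
    return False  # no entry matched
```

With the entries "!10.0.1.5" followed by "10.0.1.0/24", the host 10.0.1.5 hits the exclusion first and is not trusted, while 10.0.1.77 falls through to the network entry and is.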
518
519 This operates additively, so a "trusted_networks" line after
520 another one will result in all those networks becoming trusted.
521 To clear out the existing entries, use "clear_trusted_net‐
522 works".
523
524 If "trusted_networks" is not set and "internal_networks" is,
525 the value of "internal_networks" will be used for this parame‐
526 ter.
527
528 If neither "trusted_networks" nor "internal_networks" is set, a
529 basic inference algorithm is applied. This works as follows:
530
531 * If the 'from' host has an IP address in a private (RFC
532 1918) network range, then it's trusted
533
534 * If there are authentication tokens in the received header,
535 and the previous host was trusted, then this host is also
536 trusted
537
538 * Otherwise this host, and all further hosts, are considered
539 untrusted.
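The inference steps listed above can be sketched in Python as follows. This is a simplified reading of the three bullet points, not the actual implementation: the relay representation (IP address plus an "auth tokens seen" flag, nearest relay first) is hypothetical, and treating the local host as the trusted starting point is an assumption.

```python
import ipaddress

# The private ranges defined by RFC 1918.
RFC1918 = [ipaddress.ip_network(n)
           for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]

def infer_trusted(relays):
    # relays: list of (ip, has_auth_tokens) pairs, nearest relay first.
    # Returns one boolean per relay: True if inferred to be trusted.
    result = []
    previous_trusted = True   # assumption: the local host is trusted
    still_trusting = True     # flips to False at the first untrusted host
    for ip, has_auth in relays:
        address = ipaddress.ip_address(ip)
        private = any(address in net for net in RFC1918)
        trusted = still_trusting and (private or (has_auth and previous_trusted))
        if not trusted:
            still_trusting = False   # all further hosts are untrusted too
        result.append(trusted)
        previous_trusted = trusted
    return result
```

Note that once a host is judged untrusted, a later private address no longer counts, which matches the "all further hosts" wording of the last bullet.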
540
541 clear_trusted_networks
542 Empty the list of trusted networks.
543
544 internal_networks ip.add.re.ss[/mask] ... (default: none)
545 What networks or hosts are 'internal' in your setup. Internal
546 means that relay hosts on these networks are considered to be
547 MXes for your domain(s), or internal relays. This uses the
548 same format as "trusted_networks", above.
549
550 This value is used when checking 'dial-up' or dynamic IP
551 address blocklists, in order to detect direct-to-MX spamming.
552
553 Trusted relays that accept mail directly from dial-up connec‐
554 tions should not be listed in "internal_networks". List them
555 only in "trusted_networks".
556
557 If "trusted_networks" is set and "internal_networks" is not,
558 the value of "trusted_networks" will be used for this parame‐
559 ter.
560
561 If neither "trusted_networks" nor "internal_networks" is set, no
562 addresses will be considered local; in other words, any relays
563 past the machine where SpamAssassin is running will be consid‐
564 ered external.
565
566 Every entry in "internal_networks" must appear in "trusted_net‐
567 works"; in other words, "internal_networks" is always a subset
568 of the trusted set.
569
570 Note: 127/8 is always included in internal_networks, regardless
571 of your config.
572
573 clear_internal_networks
574 Empty the list of internal networks.
575
576 msa_networks ip.add.re.ss[/mask] ... (default: none)
577 The networks or hosts that act as MSAs in your setup. MSA
578 means that the relay hosts on these networks accept mail from
579 your own users and authenticate them appropriately. These
580 relays will never accept mail from hosts that aren't authenticated
581 in some way. Examples of authentication include IP
582 lists, SMTP AUTH, POP-before-SMTP, etc.
583
584 All relays found in the message headers after the MSA relay
585 will take on the same trusted and internal classifications as
586 the MSA relay itself, as defined by your trusted_networks and
587 internal_networks configuration.
588
589 For example, if the MSA relay is trusted and internal, so are
590 all of the relays that precede it.
591
592 When using msa_networks to identify an MSA it is recommended
593 that you treat that MSA as both trusted and internal. When an
594 MSA is not included in msa_networks you should treat the MSA as
595 trusted but not internal, however if the MSA is also acting as
596 an MX or intermediate relay you must always treat it as both
597 trusted and internal and ensure that the MSA includes visible
598 auth tokens in its Received header to identify submission
599 clients.
600
601 Warning: Never include an MSA that also acts as an MX (or is
602 also an intermediate relay for an MX) or otherwise accepts mail
603 from non-authenticated users in msa_networks. Doing so will
604 result in unknown external relays being trusted.
605
606 clear_msa_networks
607 Empty the list of msa networks.
608
609 always_trust_envelope_sender ( 0 ⎪ 1 ) (default: 0)
610 Trust the envelope sender even if the message has been passed
611 through one or more trusted relays. See also "enve‐
612 lope_sender_header".
613
614 skip_rbl_checks ( 0 ⎪ 1 ) (default: 0)
615 By default, SpamAssassin will run RBL checks. If your ISP
616 already does this for you, set this to 1.
617
618 dns_available { yes ⎪ test[: name1 name2...] ⎪ no } (default:
619 test)
620 By default, SpamAssassin will query some default hosts on the
621 internet to attempt to check if DNS is working or not. The
622 problem is that it can introduce some delay if your network
623 connection is down, and in some cases it can wrongly guess that
624 DNS is unavailable because the test connections failed. Spa‐
625 mAssassin includes a default set of 13 servers, among which 3
626 are picked randomly.
627
628 You can however specify your own list by specifying
629
630 dns_available test: domain1.tld domain2.tld domain3.tld
631
632 Please note, the DNS test queries for NS records.
633
634 SpamAssassin's network rules are run in parallel. This can
635 cause overhead in terms of the number of file descriptors
636 required; it is recommended that the minimum limit on file
637 descriptors be raised to at least 256 for safety.
638
639 dns_test_interval n (default: 600 seconds)
640 If dns_available is set to 'test' (which is the default), the
641 dns_test_interval time in number of seconds will tell SpamAs‐
642 sassin how often to retest for working DNS.
643
644 dns_options rotate (default: empty)
645 If set to 'rotate', this causes SpamAssassin to choose a DNS
646 server at random from all servers listed in "/etc/resolv.conf"
647 every 'dns_test_interval' seconds, effectively spreading the
648 load over all currently available DNS servers when there are
649 many spamd workers.
650
651 LEARNING OPTIONS
652
653 use_bayes ( 0 ⎪ 1 ) (default: 1)
654 Whether to use the naive-Bayesian-style classifier built into
655 SpamAssassin. This is a master on/off switch for all Bayes-
656 related operations.
657
658 use_bayes_rules ( 0 ⎪ 1 ) (default: 1)
659 Whether to use rules using the naive-Bayesian-style classifier
660 built into SpamAssassin. This allows you to disable the rules
661 while leaving auto and manual learning enabled.
662
663 bayes_auto_learn ( 0 ⎪ 1 ) (default: 1)
664 Whether SpamAssassin should automatically feed high-scoring
665 mails (or low-scoring mails, for non-spam) into its learning
666 systems. The only learning system supported currently is a
667 naive-Bayesian-style classifier.
668
669 See the documentation for the "Mail::SpamAssassin::Plug‐
670 in::AutoLearnThreshold" plugin module for details on how Bayes
671 auto-learning is implemented by default.
672
673 bayes_ignore_header header_name
674 If you receive mail filtered by upstream mail systems, like a
675 spam-filtering ISP or mailing list, and that service adds new
676 headers (as most of them do), these headers may provide inap‐
677 propriate cues to the Bayesian classifier, allowing it to take
678 a "short cut". To avoid this, list the headers using this set‐
679 ting. Example:
680
681 bayes_ignore_header X-Upstream-Spamfilter
682 bayes_ignore_header X-Upstream-SomethingElse
683
684 bayes_ignore_from add@ress.com
685 Bayesian classification and autolearning will not be performed
686 on mail from the listed addresses. Program "sa-learn" will
687 also ignore the listed addresses if it is invoked using the
688 "--use-ignores" option. One or more addresses can be listed,
689 see "whitelist_from".
690
691 Spam messages from certain senders may contain many words that
692 frequently occur in ham. For example, one might read messages
693 from a preferred bookstore but also get unwanted spam messages
694 from other bookstores. If the unwanted messages are learned as
695 spam then any messages discussing books, including the pre‐
696 ferred bookstore and antiquarian messages would be in danger of
697 being marked as spam. The addresses of the annoying bookstores
698 would be listed. (Assuming they were halfway legitimate and
699 didn't send you mail through myriad affiliates.)
700
701 Those who have pieces of spam in legitimate messages or other‐
702 wise receive ham messages containing potentially spammy words
703 might fear that some spam messages might be in danger of being
704 marked as ham. The addresses of the spam mailing lists, corre‐
705 spondents, etc. would be listed.
706
707 bayes_ignore_to add@ress.com
708 Bayesian classification and autolearning will not be performed
709 on mail to the listed addresses. See "bayes_ignore_from" for
710 details.
711
712 bayes_min_ham_num (Default: 200)
713 bayes_min_spam_num (Default: 200)
714 To be accurate, the Bayes system does not activate until a cer‐
715 tain number of ham (non-spam) and spam have been learned. The
716 default is 200 each of ham and spam, but you can tune these up
717 or down with these two settings.
718
719 bayes_learn_during_report (Default: 1)
720 The Bayes system will, by default, learn any reported messages
721 ("spamassassin -r") as spam. If you do not want this to hap‐
722 pen, set this option to 0.
723
724 bayes_sql_override_username
725 Used by BayesStore::SQL storage implementation.
726
727 If this option is set the BayesStore::SQL module will override
728 the set username with the value given. This could be useful
729 for implementing global or group bayes databases.
730
731 bayes_use_hapaxes (default: 1)
732 Should the Bayesian classifier use hapaxes (words/tokens that
733 occur only once) when classifying? This produces significantly
734 better hit-rates, but increases database size by about a factor
735 of 8 to 10.
736
737 bayes_journal_max_size (default: 102400)
738 SpamAssassin will opportunistically sync the journal and the
739 database. It will do so once a day, but will sync more often
740 if the journal file size goes above this setting, in bytes. If
741 set to 0, opportunistic syncing will not occur.
742
743 bayes_expiry_max_db_size (default: 150000)
744 What should be the maximum size of the Bayes tokens database?
745 When expiry occurs, the Bayes system will keep either 75% of
746 the maximum value, or 100,000 tokens, whichever has a larger
747 value. 150,000 tokens is roughly equivalent to an 8 MB database
748 file.
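The expiry target described above is simple arithmetic, shown here as a Python sketch with a hypothetical function name: the system keeps the larger of 75% of the configured maximum and 100,000 tokens.

```python
def tokens_kept_after_expiry(max_db_size: int) -> int:
    # Keep 75% of the configured maximum, but never fewer than 100,000.
    return max(max_db_size * 3 // 4, 100_000)
```

With the default of 150000 this gives 112,500 tokens; for any maximum below about 133,334 the 100,000-token floor applies instead.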
749
750 bayes_auto_expire (default: 1)
751 If enabled, the Bayes system will try to automatically expire
752 old tokens from the database. Auto-expiry occurs when the num‐
753 ber of tokens in the database surpasses the
754 bayes_expiry_max_db_size value.
755
756 bayes_learn_to_journal (default: 0)
757 If this option is set, whenever SpamAssassin does Bayes learn‐
758 ing, it will put the information into the journal instead of
759 directly into the database. This lowers contention for locking
760 the database to execute an update, but will also cause more
761 access to the journal and cause a delay before the updates are
762 actually committed to the Bayes database.
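
To illustrate how the Bayes settings above fit together, here is a minimal sketch of a site that stops learning from reports, raises the token limit, and defers learning to the journal. The values are assumptions chosen for the example, not recommendations.

```
# Illustrative values only; tune for your own site.
bayes_learn_during_report 0        # do not learn "spamassassin -r" reports as spam
bayes_use_hapaxes         1        # keep once-seen tokens (the default)
bayes_journal_max_size    204800   # sync once the journal exceeds 204800 bytes
bayes_expiry_max_db_size  300000   # twice the default token limit
bayes_auto_expire         1        # expire old tokens automatically (the default)
bayes_learn_to_journal    1        # write learning results to the journal first
```

With bayes_expiry_max_db_size set to 300000, expiry would keep 225,000 tokens (75% of the maximum), since that is larger than the 100,000 token floor described above.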
763
764 MISCELLANEOUS OPTIONS
765
766 lock_method type
767 Select the file-locking method used to protect database files
768 on-disk. By default, SpamAssassin uses an NFS-safe locking
769 method on UNIX; however, if you are sure that the database
770 files you'll be using for Bayes and AWL storage will never be
771 accessed over NFS, a non-NFS-safe locking system can be
772 selected.
773
774 This will be quite a bit faster, but may risk file corruption
775 if the files are ever accessed by multiple clients at once, and
776 one or more of them is accessing them through an NFS filesys‐
777 tem.
778
779 Note that different platforms require different locking sys‐
780 tems.
781
782 The supported locking systems for "type" are as follows:
783
784 nfssafe - an NFS-safe locking system
785 flock - simple UNIX "flock()" locking
win32 - Win32 locking using "sysopen (..., O_CREAT|O_EXCL)".
787
788 nfssafe and flock are only available on UNIX, and win32 is only
789 available on Windows. By default, SpamAssassin will choose
790 either nfssafe or win32 depending on the platform in use.
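
As a short sketch, a UNIX host whose Bayes and AWL files are known to live on a local disk could select the faster locking method as follows. The assumption that no client reaches the files over NFS is the administrator's to verify.

```
# Assumes the database files are never accessed over NFS.
lock_method flock
```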
791
fold_headers ( 0 | 1 ) (default: 1)
By default, headers added by SpamAssassin will be whitespace
folded. In other words, they will be broken up into multiple
lines instead of one very long one, and each continuation line
will have a tab character prepended to mark it as a continuation
of the preceding one.
798
799 The automatic wrapping can be disabled here. Note that this
800 can generate very long lines.
801
802 report_safe_copy_headers header_name ...
803 If using "report_safe", a few of the headers from the original
message are copied into the wrapper header (From, To, Cc,
Subject, Date, etc.). If you want to have other headers copied as
806 well, you can add them using this option. You can specify mul‐
807 tiple headers on the same line, separated by spaces, or you can
808 just use multiple lines.
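
For instance, the two forms described above (several headers on one line, or several lines) can be combined. The header names here are only examples of headers a site might want to preserve.

```
# Copy three extra headers into the report_safe wrapper message.
report_safe_copy_headers X-Mailer User-Agent
report_safe_copy_headers List-Id
```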
809
810 envelope_sender_header Name-Of-Header
811 SpamAssassin will attempt to discover the address used in the
812 'MAIL FROM:' phase of the SMTP transaction that delivered this
813 message, if this data has been made available by the SMTP
814 server. This is used in the "EnvelopeFrom" pseudo-header, and
815 for various rules such as SPF checking.
816
817 By default, various MTAs will use different headers, such as
818 the following:
819
820 X-Envelope-From
821 Envelope-Sender
822 X-Sender
823 Return-Path
824
825 SpamAssassin will attempt to use these, if some heuristics
826 (such as the header placement in the message, or the absence of
827 fetchmail signatures) appear to indicate that they are safe to
828 use. However, it may choose the wrong headers in some
829 mailserver configurations. (More discussion of this can be
830 found in bug 2142 and bug 4747 in the SpamAssassin BugZilla.)
831
832 To avoid this heuristic failure, the "envelope_sender_header"
833 setting may be helpful. Name the header that your MTA adds to
834 messages containing the address used at the MAIL FROM step of
835 the SMTP transaction.
836
837 If the header in question contains "<" or ">" characters at the
838 start and end of the email address in the right-hand side, as
839 in the SMTP transaction, these will be stripped.
840
If the header is not found in a message, or if its value does
842 not contain an "@" sign, SpamAssassin will issue a warning in
843 the logs and fall back to its default heuristics.
844
(Note for MTA developers: we would prefer if the use of a single
header be avoided in future, since that precludes 'downstream'
spam scanning.
"http://wiki.apache.org/spamassassin/EnvelopeSenderInReceived"
details a better proposal, storing the envelope sender at each
hop in the "Received" header.)
850
851 example:
852
853 envelope_sender_header X-SA-Exim-Mail-From
854
855 describe SYMBOLIC_TEST_NAME description ...
856 Used to describe a test. This text is shown to users in the
857 detailed report.
858
859 Note that test names which begin with '__' are reserved for
860 meta-match sub-rules, and are not scored or listed in the
861 'tests hit' reports.
862
863 Also note that by convention, rule descriptions should be lim‐
864 ited in length to no more than 50 characters.
865
866 report_charset CHARSET (default: unset)
867 Set the MIME Content-Type charset used for the text/plain
868 report which is attached to spam mail messages.
869
870 report ...some text for a report...
871 Set the report template which is attached to spam mail mes‐
872 sages. See the "10_default_prefs.cf" configuration file in
873 "/usr/share/spamassassin" for an example.
874
875 If you change this, try to keep it under 78 columns. Each
876 "report" line appends to the existing template, so use
877 "clear_report_template" to restart.
878
879 Tags can be included as explained above.
880
881 clear_report_template
882 Clear the report template.
883
884 report_contact ...text of contact address...
885 Set what _CONTACTADDRESS_ is replaced with in the above report
886 text. By default, this is 'the administrator of that system',
887 since the hostname of the system the scanner is running on is
888 also included.
889
890 report_hostname ...hostname to use...
891 Set what _HOSTNAME_ is replaced with in the above report text.
892 By default, this is determined dynamically as whatever the host
893 running SpamAssassin calls itself.
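
The report options above can be combined as in the following sketch, which restarts the template and uses only the two tags named in this section. The address and hostname are placeholders.

```
# Start from an empty template, then append lines to it.
clear_report_template
report This message was flagged as possible spam by the filter
report running on _HOSTNAME_. If you have questions, contact
report _CONTACTADDRESS_ for details.

# Placeholder values substituted for the tags above.
report_contact  postmaster@example.com
report_hostname mail.example.com
```

Each "report" line stays under 78 columns, as suggested above.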
894
895 unsafe_report ...some text for a report...
896 Set the report template which is attached to spam mail messages
which contain a non-text/plain part. See the
"10_default_prefs.cf" configuration file in
"/usr/share/spamassassin" for an example.
900
Each "unsafe_report" line appends to the existing template, so
902 use "clear_unsafe_report_template" to restart.
903
904 Tags can be used in this template (see above for details).
905
906 clear_unsafe_report_template
907 Clear the unsafe_report template.
908
PRIVILEGED SETTINGS
910 These settings differ from the ones above, in that they are considered
911 'privileged'. Only users running "spamassassin" from their proc‐
912 mailrc's or forward files, or sysadmins editing a file in
913 "/etc/mail/spamassassin", can use them. "spamd" users cannot use them
914 in their "user_prefs" files, for security and efficiency reasons,
915 unless "allow_user_rules" is enabled (and then, they may only add rules
916 from below).
917
allow_user_rules ( 0 | 1 ) (default: 0)
919 This setting allows users to create rules (and only rules) in their
920 "user_prefs" files for use with "spamd". It defaults to off,
921 because this could be a severe security hole. It may be possible
922 for users to gain root level access if "spamd" is run as root. It
923 is NOT a good idea, unless you have some other way of ensuring that
924 users' tests are safe. Don't use this unless you are certain you
925 know what you are doing. Furthermore, this option causes spamassas‐
926 sin to recompile all the tests each time it processes a message for
927 a user with a rule in his/her "user_prefs" file, which could have a
928 significant effect on server load. It is not recommended.
929
930 Note that it is not currently possible to use "allow_user_rules" to
931 modify an existing system rule from a "user_prefs" file with
932 "spamd".
933
934 redirector_pattern /pattern/modifiers
935 A regex pattern that matches both the redirector site portion, and
936 the target site portion of a URI.
937
938 Note: The target URI portion must be surrounded in parentheses and
939 no other part of the pattern may create a backreference.
940
Example: http://chkpt.zdnet.com/chkpt/whatever/spammer.domain/yo/dude
943
944 redirector_pattern /^https?:\/\/(?:opt\.)?chkpt\.zdnet\.com\/chkpt\/\w+\/(.*)$/i
945
946 header SYMBOLIC_TEST_NAME header op /pattern/modifiers [if-unset:
947 STRING]
948 Define a test. "SYMBOLIC_TEST_NAME" is a symbolic test name, such
949 as 'FROM_ENDS_IN_NUMS'. "header" is the name of a mail header,
950 such as 'Subject', 'To', etc.
951
952 Appending ":raw" to the header name will inhibit decoding of
953 quoted-printable or base-64 encoded strings.
954
955 Appending ":addr" to the header name will cause everything except
956 the first email address to be removed from the header. For exam‐
957 ple, all of the following will result in "example@foo":
958
959 example@foo
960 example@foo (Foo Blah)
961 example@foo, example@bar
962 display: example@foo (Foo Blah), example@bar ;
963 Foo Blah <example@foo>
964 "Foo Blah" <example@foo>
965 "'Foo Blah'" <example@foo>
966
967 Appending ":name" to the header name will cause everything except
968 the first real name to be removed from the header. For example,
969 all of the following will result in "Foo Blah"
970
971 example@foo (Foo Blah)
972 example@foo (Foo Blah), example@bar
973 display: example@foo (Foo Blah), example@bar ;
974 Foo Blah <example@foo>
975 "Foo Blah" <example@foo>
976 "'Foo Blah'" <example@foo>
977
978 There are several special pseudo-headers that can be specified:
979
980 "ALL" can be used to mean the text of all the message's headers.
981 "ToCc" can be used to mean the contents of both the 'To' and 'Cc'
982 headers.
983 "EnvelopeFrom" is the address used in the 'MAIL FROM:' phase of the
984 SMTP transaction that delivered this message, if this data has been
985 made available by the SMTP server. See "envelope_sender_header"
986 for more information on how to set this.
987 "MESSAGEID" is a symbol meaning all Message-Id's found in the mes‐
988 sage; some mailing list software moves the real 'Message-Id' to
989 'Resent-Message-Id' or 'X-Message-Id', then uses its own one in the
990 'Message-Id' header. The value returned for this symbol is the
991 text from all 3 headers, separated by newlines.
992 "X-Spam-Relays-Untrusted", "X-Spam-Relays-Trusted",
993 "X-Spam-Relays-Internal" and "X-Spam-Relays-External" represent a
994 portable, pre-parsed representation of the message's network path,
995 as recorded in the Received headers, divided into 'trusted' vs
996 'untrusted' and 'internal' vs 'external' sets. See
997 "http://wiki.apache.org/spamassassin/TrustedRelays" for more
998 details.
999
1000 "op" is either "=~" (contains regular expression) or "!~" (does not
1001 contain regular expression), and "pattern" is a valid Perl regular
1002 expression, with "modifiers" as regexp modifiers in the usual
1003 style. Note that multi-line rules are not supported, even if you
1004 use "x" as a modifier. Also note that the "#" character must be
1005 escaped ("\#") or else it will be considered to be the start of a
1006 comment and not part of the regexp.
1007
1008 If the "[if-unset: STRING]" tag is present, then "STRING" will be
1009 used if the header is not found in the mail message.
1010
1011 Test names must not start with a number, and must contain only
1012 alphanumerics and underscores. It is suggested that lower-case
1013 characters not be used, and names have a length of no more than 22
1014 characters, as an informal convention. Dashes are not allowed.
1015
1016 Note that test names which begin with '__' are reserved for meta-
1017 match sub-rules, and are not scored or listed in the 'tests hit'
1018 reports. Test names which begin with 'T_' are reserved for tests
1019 which are undergoing QA, and these are given a very low score.
1020
1021 If you add or modify a test, please be sure to run a sanity check
1022 afterwards by running "spamassassin --lint". This will avoid con‐
1023 fusing error messages, or other tests being skipped as a
1024 side-effect.
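
The following sketch pulls together the header test features described above: a plain pattern match, the ":addr" modifier, the "[if-unset: STRING]" tag, and an escaped "#". All rule names are hypothetical and "example.com" is a placeholder domain.

```
# Plain match on a header, with a description and a score.
header   LOCAL_SUBJ_FREE_OFFER  Subject =~ /\bfree offer\b/i
describe LOCAL_SUBJ_FREE_OFFER  Subject mentions a free offer
score    LOCAL_SUBJ_FREE_OFFER  1.0

# :addr reduces the header to its first email address before matching.
header   LOCAL_FROM_EXAMPLE     From:addr =~ /\@example\.com$/i

# If there is no Subject header, the pattern is matched against "UNSET".
header   LOCAL_NO_SUBJECT       Subject =~ /^UNSET$/ [if-unset: UNSET]

# A literal "#" in the pattern must be escaped; the leading "__" marks
# this as an unscored sub-rule.
header   __LOCAL_SUBJ_TICKET    Subject =~ /ticket \#\d+/i
```

After adding rules like these, running "spamassassin --lint" as recommended above should report any syntax problems.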
1025
1026 header SYMBOLIC_TEST_NAME exists:name_of_header
1027 Define a header existence test. "name_of_header" is the name of a
1028 header to test for existence. This is just a very simple version
1029 of the above header tests.
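
A minimal example of an existence test, using a hypothetical rule name:

```
# Hits when the message carries a List-Id header, whatever its value.
header   LOCAL_HAS_LIST_ID  exists:List-Id
describe LOCAL_HAS_LIST_ID  Message has a List-Id header
```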
1030
1031 header SYMBOLIC_TEST_NAME eval:name_of_eval_method([arguments])
1032 Define a header eval test. "name_of_eval_method" is the name of a
1033 method on the "Mail::SpamAssassin::EvalTests" object. "arguments"
1034 are optional arguments to the function call.
1035
1036 header SYMBOLIC_TEST_NAME eval:check_rbl('set', 'zone' [, 'sub-test'])
1037 Check a DNSBL (a DNS blacklist or whitelist). This will retrieve
1038 Received: headers from the message, extract the IP addresses,
1039 select which ones are 'untrusted' based on the "trusted_networks"
logic, and query that DNSBL zone. There are a few things to note:
1041
1042 duplicated or private IPs
1043 Duplicated IPs are only queried once and reserved IPs are not
1044 queried. Private IPs are those listed in
1045 <http://www.iana.org/assignments/ipv4-address-space>,
1046 <http://duxcw.com/faq/network/privip.htm>,
1047 <http://duxcw.com/faq/network/autoip.htm>, or
1048 <ftp://ftp.rfc-editor.org/in-notes/rfc3330.txt> as private.
1049
1050 the 'set' argument
1051 This is used as a 'zone ID'. If you want to look up a multi‐
1052 ple-meaning zone like NJABL or SORBS, you can then query the
1053 results from that zone using it; but all check_rbl_sub() calls
1054 must use that zone ID.
1055
1056 Also, if more than one IP address gets a DNSBL hit for a par‐
1057 ticular rule, it does not affect the score because rules only
1058 trigger once per message.
1059
1060 the 'zone' argument
1061 This is the root zone of the DNSBL, ending in a period.
1062
1063 the 'sub-test' argument
1064 This optional argument behaves the same as the sub-test argu‐
1065 ment in "check_rbl_sub()" below.
1066
1067 selecting all IPs except for the originating one
1068 This is accomplished by placing '-notfirsthop' at the end of
1069 the set name. This is useful for querying against DNS lists
1070 which list dialup IP addresses; the first hop may be a dialup,
1071 but as long as there is at least one more hop, via their outgo‐
1072 ing SMTP server, that's legitimate, and so should not gain
1073 points. If there is only one hop, that will be queried anyway,
1074 as it should be relaying via its outgoing SMTP server instead
1075 of sending directly to your MX (mail exchange).
1076
1077 selecting IPs by whether they are trusted
1078 When checking a 'nice' DNSBL (a DNS whitelist), you cannot
1079 trust the IP addresses in Received headers that were not added
1080 by trusted relays. To test the first IP address that can be
1081 trusted, place '-firsttrusted' at the end of the set name.
1082 That should test the IP address of the relay that connected to
1083 the most remote trusted relay.
1084
1085 Note that this requires that SpamAssassin know which relays are
1086 trusted. For simple cases, SpamAssassin can make a good esti‐
1087 mate. For complex cases, you may get better results by setting
1088 "trusted_networks" manually.
1089
1090 In addition, you can test all untrusted IP addresses by placing
1091 '-untrusted' at the end of the set name. Important note --
1092 this does NOT include the IP address from the most recent
1093 'untrusted line', as used in '-firsttrusted' above. That's
1094 because we're talking about the trustworthiness of the IP
1095 address data, not the source header line, here; and in the case
1096 of the most recent header (the 'firsttrusted'), that data can
be trusted. See the Wiki page at
"http://wiki.apache.org/spamassassin/TrustedRelays" for more
information on this.
1099
1100 Selecting just the last external IP
1101 By using '-lastexternal' at the end of the set name, you can
1102 select only the external host that connected to your internal
1103 network, or at least the last external host with a public IP.
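
The set-name suffixes described above can be sketched as follows. The rule names are hypothetical, and the zones under example.org are placeholders rather than real DNSBLs; note that each zone ends in a period.

```
# Skip the originating hop when checking a list of dialup addresses.
header LOCAL_RBL_DIALUP   eval:check_rbl('dialup-notfirsthop', 'dialups.example.org.')

# Check only the last external host that handed the message to us.
header LOCAL_RBL_LASTEXT  eval:check_rbl('bl-lastexternal', 'bl.example.org.')

# For a DNS whitelist, check only the first address that can be trusted.
header LOCAL_DNSWL        eval:check_rbl('wl-firsttrusted', 'wl.example.org.')
```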
1104
1105 header SYMBOLIC_TEST_NAME eval:check_rbl_txt('set', 'zone')
1106 Same as check_rbl(), except querying using IN TXT instead of IN A
1107 records. If the zone supports it, it will result in a line of text
1108 describing why the IP is listed, typically a hyperlink to a data‐
1109 base entry.
1110
1111 header SYMBOLIC_TEST_NAME eval:check_rbl_sub('set', 'sub-test')
Create a sub-test for 'set'. If you want to look up a
multi-meaning zone like relays.osirusoft.com, you can then query
the results from that zone using the zone ID from the original
query. The sub-test may be one of the following: an IPv4 dotted
address, for RBLs that return multiple A records; a non-negative
decimal number specifying a bitmask, for RBLs that return a single
A record containing a bitmask of results; a SenderBase test
beginning with "sb:"; or (if none of the preceding options seem to
fit) a regular expression.

Note: the set name must be exactly the same as for the main query
rule, including selections like '-notfirsthop' appearing at the end
of the set name.
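
As a sketch of the pattern described above, one main query can feed several sub-tests. The zone and rule names are placeholders, and the return values 127.0.0.2 and 4 are assumptions about what a hypothetical list might return.

```
# Main query: unscored sub-rule that establishes the zone ID.
header __LOCAL_RBL_MULTI  eval:check_rbl('multi-notfirsthop', 'multi.example.org.')

# Sub-test on a specific A record value returned by the zone.
header LOCAL_RBL_MULTI_A  eval:check_rbl_sub('multi-notfirsthop', '127.0.0.2')

# Sub-test using a decimal bitmask against a single returned A record.
header LOCAL_RBL_MULTI_B  eval:check_rbl_sub('multi-notfirsthop', '4')
```

The set name, including the '-notfirsthop' suffix, is identical in all three rules.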
1124
1125 body SYMBOLIC_TEST_NAME /pattern/modifiers
1126 Define a body pattern test. "pattern" is a Perl regular expres‐
1127 sion. Note: as per the header tests, "#" must be escaped ("\#") or
1128 else it is considered the beginning of a comment.
1129
1130 The 'body' in this case is the textual parts of the message body;
1131 any non-text MIME parts are stripped, and the message decoded from
1132 Quoted-Printable or Base-64-encoded format if necessary. The mes‐
1133 sage Subject header is considered part of the body and becomes the
1134 first paragraph when running the rules. All HTML tags and line
1135 breaks will be removed before matching.
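
Two hypothetical body rules illustrate the points above. Because line breaks are removed before matching, the first pattern can match a phrase that was split across lines in the original message; the second shows an escaped "#".

```
body     LOCAL_ACT_NOW    /\bact now\b/i
describe LOCAL_ACT_NOW    Body urges the reader to act now

# "\#" keeps the hash from starting a comment.
body     LOCAL_ORDER_NUM  /order \#\d{4,}/i
```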
1136
1137 body SYMBOLIC_TEST_NAME eval:name_of_eval_method([args])
1138 Define a body eval test. See above.
1139
1140 uri SYMBOLIC_TEST_NAME /pattern/modifiers
1141 Define a uri pattern test. "pattern" is a Perl regular expression.
1142 Note: as per the header tests, "#" must be escaped ("\#") or else
1143 it is considered the beginning of a comment.
1144
1145 The 'uri' in this case is a list of all the URIs in the body of the
1146 email, and the test will be run on each and every one of those
1147 URIs, adjusting the score if a match is found. Use this test
1148 instead of one of the body tests when you need to match a URI, as
1149 it is more accurately bound to the start/end points of the URI, and
1150 will also be faster.
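
A hypothetical uri rule, anchored at the start of each URI as the text above suggests is possible; "example.com" is a placeholder domain.

```
# Matches any http or https link to a host under example.com.
uri      LOCAL_URI_EXAMPLE  /^https?:\/\/[^\/]+\.example\.com\//i
describe LOCAL_URI_EXAMPLE  Links to a host under example.com
```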
1151
1152 rawbody SYMBOLIC_TEST_NAME /pattern/modifiers
1153 Define a raw-body pattern test. "pattern" is a Perl regular
1154 expression. Note: as per the header tests, "#" must be escaped
1155 ("\#") or else it is considered the beginning of a comment.
1156
1157 The 'raw body' of a message is the raw data inside all textual
1158 parts. The text will be decoded from base64 or quoted-printable
1159 encoding, but HTML tags and line breaks will still be present.
1160 The pattern will be applied line-by-line.
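
Since HTML tags are still present in the raw body, a rawbody rule can look at markup that a body rule never sees. The rule below is a hypothetical sketch that looks for an HTML comment stuffed with a long run of letters and digits.

```
# Applied line by line, so the whole comment must sit on one line to match.
rawbody  LOCAL_HTML_COMMENT  /<!--\s*[a-z0-9]{20,}\s*-->/i
```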
1161
1162 rawbody SYMBOLIC_TEST_NAME eval:name_of_eval_method([args])
1163 Define a raw-body eval test. See above.
1164
1165 full SYMBOLIC_TEST_NAME /pattern/modifiers
1166 Define a full message pattern test. "pattern" is a Perl regular
1167 expression. Note: as per the header tests, "#" must be escaped
1168 ("\#") or else it is considered the beginning of a comment.
1169
1170 The full message is the pristine message headers plus the pristine
1171 message body, including all MIME data such as images, other attach‐
1172 ments, MIME boundaries, etc.
1173
1174 full SYMBOLIC_TEST_NAME eval:name_of_eval_method([args])
1175 Define a full message eval test. See above.
1176
1177 meta SYMBOLIC_TEST_NAME boolean expression
1178 Define a boolean expression test in terms of other tests that have
1179 been hit or not hit. For example:
1180
meta META1 TEST1 && !(TEST2 || TEST3)
1182
1183 Note that English language operators ("and", "or") will be treated
1184 as rule names, and that there is no "XOR" operator.
1185
1186 meta SYMBOLIC_TEST_NAME boolean arithmetic expression
1187 Can also define a boolean arithmetic expression in terms of other
1188 tests, with an unhit test having the value "0" and a hit test hav‐
1189 ing a nonzero value. The value of a hit meta test is that of its
1190 arithmetic expression. The value of a hit eval test is that
1191 returned by its method. The value of a hit header, body, rawbody,
1192 uri, or full test which has the "multiple" tflag is the number of
1193 times the test hit. The value of any other type of hit test is
1194 "1".
1195
1196 For example:
1197
1198 meta META2 (3 * TEST1 - 2 * TEST2) > 0
1199
1200 Note that Perl builtins and functions, like "abs()", can't be used,
1201 and will be treated as rule names.
1202
1203 If you want to define a meta-rule, but do not want its individual
1204 sub-rules to count towards the final score unless the entire meta-
1205 rule matches, give the sub-rules names that start with '__' (two
1206 underscores). SpamAssassin will ignore these for scoring.
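
The two forms of meta rule, built on unscored sub-rules, can be sketched as follows. All rule names and patterns are hypothetical.

```
# Sub-rules: the leading "__" keeps them out of scoring and reports.
body __LOCAL_WORD_PRIZE   /\bprize\b/i
body __LOCAL_WORD_WINNER  /\bwinner\b/i
body __LOCAL_WORD_CLAIM   /\bclaim\b/i

# Boolean form: hits only when both sub-rules hit.
meta LOCAL_PRIZE_WINNER   (__LOCAL_WORD_PRIZE && __LOCAL_WORD_WINNER)

# Arithmetic form: each hit sub-rule counts as 1, so this hits when
# at least two of the three sub-rules hit.
meta LOCAL_PRIZE_ANY_TWO  ((__LOCAL_WORD_PRIZE + __LOCAL_WORD_WINNER + __LOCAL_WORD_CLAIM) > 1)
```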
1207
tflags SYMBOLIC_TEST_NAME [ {net|nice|learn|userconf|noautolearn|multiple} ]
1210 Used to set flags on a test. These flags are used in the score-
1211 determination back end system for details of the test's behaviour.
1212 Please see "bayes_auto_learn" for more information about tflag
1213 interaction with those systems. The following flags can be set:
1214
1215 net The test is a network test, and will not be run in the mass
1216 checking system or if -L is used, therefore its score should
1217 not be modified.
1218
1219 nice
1220 The test is intended to compensate for common false positives,
1221 and should be assigned a negative score.
1222
1223 userconf
1224 The test requires user configuration before it can be used
(like language-specific tests).
1226
1227 learn
1228 The test requires training before it can be used.
1229
1230 noautolearn
1231 The test will explicitly be ignored when calculating the score
1232 for learning systems.
1233
1234 multiple
1235 The test will be evaluated multiple times, for use with meta
1236 rules. Only affects header, body, rawbody, uri, and full
1237 tests.
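
As a sketch of two of the flags above: "multiple" lets a meta rule count how often a sub-rule hit, and "net" marks a rule that depends on network lookups. Rule names and the zone are hypothetical.

```
# Count dollar amounts in the body, then hit if there are more than three.
body   __LOCAL_DOLLAR_AMT  /\$\d+/
tflags __LOCAL_DOLLAR_AMT  multiple
meta   LOCAL_MANY_DOLLARS  (__LOCAL_DOLLAR_AMT > 3)

# A DNSBL lookup is a network test and is skipped when -L is used.
header LOCAL_RBL_DEMO      eval:check_rbl('demo', 'bl.example.org.')
tflags LOCAL_RBL_DEMO      net
```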
1238
1239 priority SYMBOLIC_TEST_NAME n
1240 Assign a specific priority to a test. All tests, except for DNS
1241 and Meta tests, are run in increasing priority value order (nega‐
1242 tive priority values are run before positive priority values). The
1243 default test priority is 0 (zero).
1244
1245 The values <-99999999999999> and <-99999999999998> have a special
1246 meaning internally, and should not be used.
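
For example, a hypothetical rule could be scheduled ahead of the default-priority rules like this:

```
# Negative values run before the default priority of 0.
priority LOCAL_ACT_NOW  -10
```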
1247
1249 These settings differ from the ones above, in that they are considered
1250 'more privileged' -- even more than the ones in the PRIVILEGED SETTINGS
1251 section. No matter what "allow_user_rules" is set to, these can never
1252 be set from a user's "user_prefs" file when spamc/spamd is being used.
1253 However, all settings can be used by local programs run directly by the
1254 user.
1255
1256 version_tag string
1257 This tag is appended to the SA version in the X-Spam-Status header.
You should include it when you modify your ruleset, especially if you
1259 plan to distribute it. A good choice for string is your last name
1260 or your initials followed by a number which you increase with each
1261 change.
1262
1263 The version_tag will be lowercased, and any non-alphanumeric or
1264 period character will be replaced by an underscore.
1265
1266 e.g.
1267
1268 version_tag myrules1 # version=2.41-myrules1
1269
test SYMBOLIC_TEST_NAME (ok|fail) Some string to test against
1271 Define a regression testing string. You can have more than one
1272 regression test string per symbolic test name. Simply specify a
1273 string that you wish the test to match.
1274
1275 These tests are only run as part of the test suite - they should
1276 not affect the general running of SpamAssassin.
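
A hypothetical rule with one matching and one non-matching regression string might look like this:

```
body LOCAL_ACT_NOW  /\bact now\b/i

# "ok" strings are expected to hit the rule; "fail" strings are not.
test LOCAL_ACT_NOW ok   You must act now to claim your prize
test LOCAL_ACT_NOW fail We acted on this yesterday
```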
1277
1278 rbl_timeout t [t_min] [zone] (default: 15 3)
1279 All DNS queries are made at the beginning of a check and we try to
1280 read the results at the end. This value specifies the maximum
period of time (in seconds) to wait for a DNS query. If most of
1282 the DNS queries have succeeded for a particular message, then Spa‐
1283 mAssassin will not wait for the full period to avoid wasting time
1284 on unresponsive server(s), but will shrink the timeout according to
1285 a percentage of queries already completed. As the number of
1286 queries remaining approaches 0, the timeout value will gradually
1287 approach a t_min value, which is an optional second parameter and
1288 defaults to 0.2 * t. If t is smaller than t_min, the initial time‐
1289 out is set to t_min. Here is a chart of queries remaining versus
1290 the timeout in seconds, for the default 15 second / 3 second time‐
1291 out setting:
1292
queries left  100%  90%   80%   70%   60%   50%   40%   30%   20%   10%   0%
timeout       15    14.9  14.5  13.9  13.1  12.0  10.7  9.1   7.3   5.3   3
1295
1296 For example, if 20 queries are made at the beginning of a message
1297 check and 16 queries have returned (leaving 20%), the remaining 4
1298 queries should finish within 7.3 seconds since their query started
1299 or they will be timed out. Note that timed out queries are only
1300 aborted when there is nothing else left for SpamAssassin to do -
1301 long evaluation of other rules may grant queries additional time.
1302
If a parameter 'zone' is specified (it must end with a letter,
which distinguishes it from the other, numeric parameters), then
the setting only applies to DNS queries against the specified DNS
domain (host, domain or RBL (sub)zone). Matching is
case-insensitive; the actual domain may be a subdomain of the
specified zone.
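
The following sketch shows a global setting and a per-zone override. The numbers and the zone name are assumptions chosen for illustration.

```
# Wait at most 10 seconds, shrinking toward 2 seconds as queries complete.
rbl_timeout 10 2

# Be less patient with one particular (placeholder) zone.
rbl_timeout 5 1 bl.example.org
```

The values in the chart above are consistent with timeout = t_min + (t - t_min) * (1 - f * f), where f is the fraction of queries already completed. This is an observation fitted to the chart, not a statement about the implementation. Under that reading, "rbl_timeout 10 2" with half the queries completed would give 2 + 8 * 0.75 = 8 seconds.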
1308
1309 util_rb_tld tld1 tld2 ...
1310 This option allows the addition of new TLDs to the RegistrarBound‐
1311 aries code. Updates to the list usually happen when new versions
1312 of SpamAssassin are released, but sometimes it's necessary to add
1313 in new TLDs faster than a release can occur. TLDs include things
1314 like com, net, org, etc.
1315
1316 util_rb_2tld 2tld-1.tld 2tld-2.tld ...
1317 This option allows the addition of new 2nd-level TLDs (2TLD) to the
1318 RegistrarBoundaries code. Updates to the list usually happen when
1319 new versions of SpamAssassin are released, but sometimes it's nec‐
1320 essary to add in new 2TLDs faster than a release can occur. 2TLDs
1321 include things like co.uk, fed.us, etc.
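
Both options take a space-separated list. The names below are made-up placeholders, not real registry entries.

```
# Hypothetical new top-level and second-level domains.
util_rb_tld  newtld
util_rb_2tld co.newtld ac.newtld
```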
1322
1323 bayes_path /path/filename (default: ~/.spamassassin/bayes)
1324 This is the directory and filename for Bayes databases. Several
1325 databases will be created, with this as the base directory and
1326 filename, with "_toks", "_seen", etc. appended to the base. The
default setting results in files called
"~/.spamassassin/bayes_seen", "~/.spamassassin/bayes_toks", etc.
1329
1330 By default, each user has their own in their "~/.spamassassin"
1331 directory with mode 0700/0600. For system-wide SpamAssassin use,
1332 you may want to reduce disk space usage by sharing this across all
1333 users. However, Bayes appears to be more effective with individual
1334 user databases.
1335
1336 bayes_file_mode (default: 0700)
1337 The file mode bits used for the Bayesian filtering database files.
1338
1339 Make sure you specify this using the 'x' mode bits set, as it may
1340 also be used to create directories. However, if a file is created,
1341 the resulting file will not have any execute bits set (the umask is
1342 set to 111).
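
A site-wide shared database, as mentioned above, could be sketched like this. The directory is an assumption, and it would need to be writable by every account that runs SpamAssassin.

```
# The final "bayes" is the filename base, giving bayes_toks, bayes_seen, etc.
bayes_path      /var/lib/spamassassin/bayes/bayes

# Group-writable; the execute bits only matter if directories are created.
bayes_file_mode 0770
```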
1343
1344 bayes_store_module Name::Of::BayesStore::Module
If this option is set, the module given will be used as an
alternative to the default Bayes storage mechanism. It must conform
to the published storage specification (see
Mail::SpamAssassin::BayesStore). For example, set this to
Mail::SpamAssassin::BayesStore::SQL to use the generic SQL storage
module.
1350
bayes_sql_dsn DBI:databasetype:databasename:hostname:port
1352 Used for BayesStore::SQL storage implementation.
1353
This option gives the connect string used to connect to the
SQL-based Bayes storage.
1356
1357 bayes_sql_username
1358 Used by BayesStore::SQL storage implementation.
1359
1360 This option gives the username used by the above DSN.
1361
1362 bayes_sql_password
1363 Used by BayesStore::SQL storage implementation.
1364
1365 This option gives the password used by the above DSN.
1366
bayes_sql_username_authorized ( 0 | 1 ) (default: 0)
Whether to call the services_authorized_for_username plugin hook in
BayesSQL. If the hook does not determine that the user is allowed
to use Bayes, or determines that the user is invalid, then the
database will not be initialized.
1372
1373 NOTE: By default the user is considered invalid until a plugin
1374 returns a true value. If you enable this, but do not have a proper
1375 plugin loaded, all users will turn up as invalid.
1376
1377 The username passed into the plugin can be affected by the
1378 bayes_sql_override_username config option.
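
Taken together, the SQL-related Bayes options might be used as in this sketch. The database name, account, and password are placeholders, and the DSN follows the "DBI:mysql:spamassassin:localhost" shape used elsewhere in this document.

```
bayes_store_module  Mail::SpamAssassin::BayesStore::SQL
bayes_sql_dsn       DBI:mysql:spamassassin:localhost
bayes_sql_username  sa_user
bayes_sql_password  changeme

# Optional: store every user's tokens under one shared name.
bayes_sql_override_username  shared
```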
1379
1380 user_scores_dsn DBI:databasetype:databasename:hostname:port
1381 If you load user scores from an SQL database, this will set the DSN
1382 used to connect. Example: "DBI:mysql:spamassassin:localhost"
1383
If you load user scores from an LDAP directory, this will set the
DSN used to connect. You have to write the DSN as an LDAP URL, the
components being the host and port to connect to, the base DN for
the search, the scope of the search (base, one or sub), the single
attribute being the multivalued attribute used to hold the
configuration data (space-separated pairs of key and value, just as
in a file) and finally the filter being the expression used to
filter out the wanted username. Note that the filter expression is
used in a sprintf statement with the username as the only
parameter, thus it can hold a single __USERNAME__ expression. This
will be replaced with the username.
1395
Example:
"ldap://localhost:389/dc=koehntopp,dc=de?spamassassinconfig?uid=__USERNAME__"
1398
1399 user_scores_sql_username username
1400 The authorized username to connect to the above DSN.
1401
1402 user_scores_sql_password password
1403 The password for the database username, for the above DSN.
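
A minimal SQL setup for user scores, reusing the DSN example given above with placeholder credentials:

```
user_scores_dsn           DBI:mysql:spamassassin:localhost
user_scores_sql_username  sa_user
user_scores_sql_password  changeme
```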
1404
1405 user_scores_sql_custom_query query
1406 This option gives you the ability to create a custom SQL query to
1407 retrieve user scores and preferences. In order to work correctly
1408 your query should return two values, the preference name and value,
1409 in that order. In addition, there are several "variables" that you
1410 can use as part of your query, these variables will be substituted
1411 for the current values right before the query is run. The current
1412 allowed variables are:
1413
1414 _TABLE_
1415 The name of the table where user scores and preferences are
1416 stored. Currently hardcoded to userpref, to change this value
1417 you need to create a new custom query with the new table name.
1418
1419 _USERNAME_
1420 The current user's username.
1421
1422 _MAILBOX_
1423 The portion before the @ as derived from the current user's
1424 username.
1425
1426 _DOMAIN_
1427 The portion after the @ as derived from the current user's
1428 username, this value may be null.
1429
The query must be one continuous line in order to parse
correctly.
1432
Here are several example queries. Note that these are broken up
for easy reading; in your config each should be one continuous
line.
1436
1437 Current default query:
"SELECT preference, value FROM _TABLE_ WHERE username = _USERNAME_
OR username = '@GLOBAL' ORDER BY username ASC"
1440
1441 Use global and then domain level defaults:
"SELECT preference, value FROM _TABLE_ WHERE username = _USERNAME_
OR username = '@GLOBAL' OR username = '@~'||_DOMAIN_
ORDER BY username ASC"
1445
1446 Maybe global prefs should override user prefs:
"SELECT preference, value FROM _TABLE_ WHERE username = _USERNAME_
OR username = '@GLOBAL' ORDER BY username DESC"
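
Written as it would appear in a configuration file, the default query shown above occupies a single line:

```
user_scores_sql_custom_query SELECT preference, value FROM _TABLE_ WHERE username = _USERNAME_ OR username = '@GLOBAL' ORDER BY username ASC
```

The query returns the preference name first and the value second, in the order required above.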
1449
1450 user_scores_ldap_username
1451 This is the Bind DN used to connect to the LDAP server. It
1452 defaults to the empty string (""), allowing anonymous binding to
1453 work.
1454
1455 Example: "cn=master,dc=koehntopp,dc=de"
1456
1457 user_scores_ldap_password
1458 This is the password used to connect to the LDAP server. It
1459 defaults to the empty string ("").
1460
1461 loadplugin PluginModuleName [/path/module.pm]
1462 Load a SpamAssassin plugin module. The "PluginModuleName" is the
1463 perl module name, used to create the plugin object itself.
1464
"/path/module.pm" is the file to load, containing the module's
perl code; if it is specified as a relative path, it is considered
to be relative to the current configuration file. If it is omitted,
the module will be loaded using perl's search path (the @INC
array).
1470
1471 See "Mail::SpamAssassin::Plugin" for more details on writing plug‐
1472 ins.
1473
1474 tryplugin PluginModuleName [/path/module.pm]
1475 Same as "loadplugin", but silently ignored if the .pm file cannot
1476 be found in the filesystem.
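
A minimal sketch of the forms described above.
Mail::SpamAssassin::Plugin::SPF is a plugin distributed with
SpamAssassin; MyOptionalPlugin and its path are hypothetical names used
only for illustration, and MyPlugin is the illustrative name used
elsewhere in this document.

```
# No file given: the module is located through perl's search path (@INC).
loadplugin Mail::SpamAssassin::Plugin::SPF

# Relative path: taken relative to the current configuration file.
loadplugin MyPlugin plugintest.pm

# Hypothetical optional plugin: silently ignored if the .pm file
# cannot be found.
tryplugin MyOptionalPlugin /etc/mail/spamassassin/MyOptionalPlugin.pm
```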
1477
1479 include filename
1480 Include configuration lines from "filename". Relative paths are
1481 considered relative to the current configuration file or user pref‐
1482 erences file.
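
For instance, assuming a hypothetical file named "local-extra.cf" that
is stored alongside the current configuration file:

```
# "local-extra.cf" is a hypothetical file name; the relative path is
# taken relative to the file that contains this line.
include local-extra.cf
```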
1483
1484 if (conditional perl expression)
Used to support conditional interpretation of the configuration
file. Lines between this and a corresponding "else" or "endif" line
will be ignored unless the conditional expression evaluates as true
(in the perl sense; that is, defined and non-0).
1489
1490 The conditional accepts a limited subset of perl for security --
1491 just enough to perform basic arithmetic comparisons. The following
1492 input is accepted:
1493
1494 numbers, whitespace, arithmetic operations and grouping
1495 Namely these characters and ranges:
1496
1497 ( ) - + * / _ . , < = > ! ~ 0-9 whitespace
1498
1499 version
1500 This will be replaced with the version number of the currently-
1501 running SpamAssassin engine. Note: The version used is in the
1502 internal SpamAssassin version format which is "x.yyyzzz", where
1503 x is major version, y is minor version, and z is maintenance
1504 version. So 3.0.0 is 3.000000, and 3.4.80 is 3.004080.
1505
1506 plugin(Name::Of::Plugin)
1507 This is a function call that returns 1 if the plugin named
1508 "Name::Of::Plugin" is loaded, or "undef" otherwise.
1509
If the end of a configuration file is reached while still inside an
"if" scope, a warning will be issued, but parsing will restart on
the next file.
1513
1514 For example:
1515
if (version > 3.000000)
  header MY_FOO ...
endif

loadplugin MyPlugin plugintest.pm

if plugin (MyPlugin)
  header MY_PLUGIN_FOO eval:check_for_foo()
  score MY_PLUGIN_FOO 0.1
endif
1526
1527 ifplugin PluginModuleName
1528 An alias for "if plugin(PluginModuleName)".
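
As a sketch, the plugin example above could be written with this alias.
MyPlugin, MY_PLUGIN_FOO and check_for_foo() are the illustrative names
from that example, not real rules.

```
loadplugin MyPlugin plugintest.pm

# Equivalent to "if plugin(MyPlugin)"
ifplugin MyPlugin
  header MY_PLUGIN_FOO eval:check_for_foo()
  score MY_PLUGIN_FOO 0.1
endif
```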
1529
1530 else
Used to support conditional interpretation of the configuration
file. Lines between this and a corresponding "endif" line will be
ignored unless the conditional expression evaluates as false (in
the perl sense; that is, undefined or 0).
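
A sketch of a complete "if"/"else"/"endif" block. The rule MY_FOO, its
pattern and its scores are hypothetical, and the threshold 3.002000
(3.2.0 in the internal version format) is chosen only for illustration.

```
if (version >= 3.002000)
  # Interpreted when the running version is 3.2.0 or later.
  header MY_FOO Subject =~ /example/i
  score MY_FOO 0.5
else
  # Interpreted when the condition above is false.
  header MY_FOO Subject =~ /example/i
  score MY_FOO 0.1
endif
```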
1535
1536 require_version n.nnnnnn
1537 Indicates that the entire file, from this line on, requires a cer‐
1538 tain version of SpamAssassin to run. If a different (older or
1539 newer) version of SpamAssassin tries to read the configuration from
1540 this file, it will output a warning instead, and ignore it.
1541
1542 Note: The version used is in the internal SpamAssassin version for‐
1543 mat which is "x.yyyzzz", where x is major version, y is minor ver‐
1544 sion, and z is maintenance version. So 3.0.0 is 3.000000, and
1545 3.4.80 is 3.004080.
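
A worked example of the format, as a sketch: release 3.2.4 has major
version 3, minor version 2 and maintenance version 4, so it is written
as 3.002004. The release number is chosen only for illustration.

```
# x = 3, yyy = 002, zzz = 004: this file is intended for SpamAssassin 3.2.4
require_version 3.002004
```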
1546
1548 The following "tags" can be used as placeholders in certain options.
1549 They will be replaced by the corresponding value when they are used.
1550
1551 Some tags can take an argument (in parentheses). The argument is
1552 optional, and the default is shown below.
1553
_YESNOCAPS_         "YES"/"NO" for is/isn't spam
_YESNO_             "Yes"/"No" for is/isn't spam
_SCORE(PAD)_        message score; if PAD is included and is either spaces
                    or zeroes, then pad scores with that many spaces or
                    zeroes (default, none), i.e. _SCORE(0)_ makes 2.4
                    become 02.4, _SCORE(00)_ is 002.4. 12.3 would be 12.3
                    and 012.3 respectively.
_REQD_              message threshold
_VERSION_           version (e.g. 3.0.0 or 3.1.0-r26142-foo1)
_SUBVERSION_        sub-version/code revision date (e.g. 2004-01-10)
_HOSTNAME_          hostname of the machine the mail was processed on
_REMOTEHOSTNAME_    hostname of the machine the mail was sent from, only
                    available with spamd
_REMOTEHOSTADDR_    IP address of the machine the mail was sent from, only
                    available with spamd
_BAYES_             bayes score
_TOKENSUMMARY_      number of new, neutral, spammy, and hammy tokens found
_BAYESTC_           number of new tokens found
_BAYESTCLEARNED_    number of seen tokens found
_BAYESTCSPAMMY_     number of spammy tokens found
_BAYESTCHAMMY_      number of hammy tokens found
_HAMMYTOKENS(N)_    the N most significant hammy tokens (default, 5)
_SPAMMYTOKENS(N)_   the N most significant spammy tokens (default, 5)
_DATE_              RFC 2822 date of scan
_STARS(*)_          one "*" (use any character) for each full score point
                    (note: limited to 50 'stars')
_RELAYSTRUSTED_     relays used and deemed to be trusted (see the
                    'X-Spam-Relays-Trusted' pseudo-header)
_RELAYSUNTRUSTED_   relays used that cannot be trusted (see the
                    'X-Spam-Relays-Untrusted' pseudo-header)
_RELAYSINTERNAL_    relays used and deemed to be internal (see the
                    'X-Spam-Relays-Internal' pseudo-header)
_RELAYSEXTERNAL_    relays used and deemed to be external (see the
                    'X-Spam-Relays-External' pseudo-header)
_LASTEXTERNALIP_    IP address of client in the external-to-internal
                    SMTP handover
_LASTEXTERNALRDNS_  reverse-DNS of client in the external-to-internal
                    SMTP handover
_LASTEXTERNALHELO_  HELO string used by client in the external-to-internal
                    SMTP handover
_AUTOLEARN_         autolearn status ("ham", "no", "spam", "disabled",
                    "failed", "unavailable")
_AUTOLEARNSCORE_    portion of message score used by autolearn
_TESTS(,)_          tests hit separated by "," (or other separator)
_TESTSSCORES(,)_    as above, except with scores appended
                    (e.g. AWL=-3.0,...)
_SUBTESTS(,)_       subtests (start with "__") hit separated by ","
                    (or other separator)
_DCCB_              DCC's "Brand"
_DCCR_              DCC's results
_PYZOR_             Pyzor results
_RBL_               full results for positive RBL queries in DNS URI
                    format
_LANGUAGES_         possible languages of mail
_PREVIEW_           content preview
_REPORT_            terse report of tests hit (for header reports)
_SUMMARY_           summary of tests hit for standard report (for body
                    reports)
_CONTACTADDRESS_    contents of the 'report_contact' setting
_HEADER(NAME)_      includes the value of a message header; the value is
                    the same as is found for header rules (see elsewhere
                    in this document)
1612
1613 If a tag reference uses the name of a tag which is not in this list or
1614 defined by a loaded plugin, the reference will be left intact and not
1615 replaced by any value.
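
A short sketch of tags used in "add_header" lines. The header names
chosen here (Padded-Score, Tests, Scanned-On) are hypothetical, and
"add_header all NAME" is assumed to produce an "X-Spam-NAME" header, as
in the "X-Spam-Spammy" examples elsewhere in this document.

```
# Score padded with zeroes; per the table above, a score of 2.4
# is rendered as 002.4.
add_header all Padded-Score _SCORE(00)_

# Tests hit, separated by ","
add_header all Tests _TESTS(,)_

# Several tags combined with literal text in one header value.
add_header all Scanned-On _HOSTNAME_ at _DATE_, SpamAssassin _VERSION_
```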
1616
1617 The "HAMMYTOKENS" and "SPAMMYTOKENS" tags have an optional second argu‐
1618 ment which specifies a format. See the HAMMYTOKENS/SPAMMYTOKENS TAG
1619 FORMAT section, below, for details.
1620
1621 HAMMYTOKENS/SPAMMYTOKENS TAG FORMAT
1622
The "HAMMYTOKENS" and "SPAMMYTOKENS" tags have an optional second
argument which specifies a format: "_SPAMMYTOKENS(N,FMT)_",
"_HAMMYTOKENS(N,FMT)_". The following formats are available:
1626
1627 short
1628 Only the tokens themselves are listed. For example, preference
1629 file entry:
1630
1631 "add_header all Spammy _SPAMMYTOKENS(2,short)_"
1632
1633 Results in message header:
1634
1635 "X-Spam-Spammy: remove.php, UD:jpg"
1636
Indicating that the top two spammy tokens found are "remove.php"
and "UD:jpg". (The token itself follows the last colon; the text
before the colon indicates something about the token. "UD" means
the token looks like it might be part of a domain name.)
1641
1642 compact
1643 The token probability, an abbreviated declassification distance
1644 (see example), and the token are listed. For example, preference
1645 file entry:
1646
1647 "add_header all Spammy _SPAMMYTOKENS(2,compact)_"
1648
1649 Results in message header:
1650
"X-Spam-Spammy: 0.989-6--remove.php, 0.988-+--UD:jpg"
1652
1653 Indicating that the probabilities of the top two tokens are 0.989
1654 and 0.988, respectively. The first token has a declassification
1655 distance of 6, meaning that if the token had appeared in at least 6
1656 more ham messages it would not be considered spammy. The "+" for
1657 the second token indicates a declassification distance greater than
1658 9.
1659
1660 long
1661 Probability, declassification distance, number of times seen in a
1662 ham message, number of times seen in a spam message, age and the
1663 token are listed.
1664
1665 For example, preference file entry:
1666
1667 "add_header all Spammy _SPAMMYTOKENS(2,long)_"
1668
1669 Results in message header:
1670
1671 "X-Spam-Spammy: 0.989-6--0h-4s--4d--remove.php,
1672 0.988-33--2h-25s--1d--UD:jpg"
1673
1674 In addition to the information provided by the compact option, the
1675 long option shows that the first token appeared in zero ham mes‐
1676 sages and four spam messages, and that it was last seen four days
1677 ago. The second token appeared in two ham messages, 25 spam mes‐
1678 sages and was last seen one day ago. (Unlike the "compact" option,
1679 the long option shows declassification distances that are greater
1680 than 9.)
1681
1683 A line starting with the text "lang xx" will only be interpreted if the
1684 user is in that locale, allowing test descriptions and templates to be
1685 set for that language.
1686
The locale string should specify either both the language and country,
e.g. "lang pt_BR", or just the language, e.g. "lang de".
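
For example, the following lines (a sketch; MY_FOO and both description
texts are hypothetical) give a rule a general description and a second
one that is only interpreted for users in a German locale:

```
describe MY_FOO Subject contains the word example
lang de describe MY_FOO Betreff enthaelt das Wort example
```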
1689
1691 "Mail::SpamAssassin" "spamassassin" "spamd"
1692
1693
1694
1695perl v5.8.8 2008-01-05 Mail::SpamAssassin::Conf(3)