Mail::SpamAssassin(3) User Contributed Perl Documentation Mail::SpamAssassin(3)

NAME

       Mail::SpamAssassin - Spam detector and markup engine

SYNOPSIS

         use Mail::SpamAssassin;

         my $spamtest = Mail::SpamAssassin->new();
         my $mail = $spamtest->parse($message);
         my $status = $spamtest->check($mail);

         if ($status->is_spam()) {
           $message = $status->rewrite_mail();
         }
         else {
           ...
         }
         ...

         $status->finish();
         $mail->finish();

DESCRIPTION

       Mail::SpamAssassin is a module to identify spam using several methods
       including text analysis, internet-based realtime blacklists,
       statistical analysis, and internet-based hashing algorithms.

       Using its rule base, it applies a wide range of heuristic tests to
       mail headers and body text to identify "spam", also known as
       unsolicited bulk email.  Once identified as spam, the mail can be
       tagged as spam for later filtering, either by the user's own mail user
       agent application or at the mail transfer agent.

       If you wish to use a command-line filter tool, try the "spamassassin"
       or the "spamd"/"spamc" tools provided.

METHODS

       $t = Mail::SpamAssassin->new( { opt => val, ... } )
           Constructs a new "Mail::SpamAssassin" object.  You may pass a hash
           reference to the constructor which may contain the following
           attribute-value pairs.

           debug
               The debug options used to determine the logging level.  They
               allow sections of debug messages (called "facilities") to be
               enabled or disabled.  If this is a string, it is treated as a
               comma-delimited list of the debug facilities.  If it is a hash
               reference, the keys are treated as the list of debug
               facilities, and if it is an array reference, the elements are
               treated as the list of debug facilities.

               There are also two special cases: (1) if "info" is passed as a
               debug facility, then all informational messages are enabled;
               (2) if "all" is passed as a debug facility, then all debugging
               facilities are enabled.

           rules_filename
               The filename/directory to load spam-identifying rules from.
               (optional)

           site_rules_filename
               The directory to load site-specific spam-identifying rules
               from. (optional)

           userprefs_filename
               The filename to load preferences from. (optional)

           userstate_dir
               The directory user state is stored in. (optional)

           config_tree_recurse
               Set to 1 to recurse through directories when reading
               configuration files, instead of just reading a single level.
               (optional, default 0)

           config_text
               The text of all rules and preferences.  If you prefer not to
               load the rules from files, read them in yourself and set this
               instead.  As a result, this will override the settings for
               "rules_filename", "site_rules_filename", and
               "userprefs_filename".

           post_config_text
               Similar to "config_text", this text is placed after
               "config_text" to allow an override of config files.

           force_ipv4
               If set to 1, DNS tests will not attempt to use IPv6.  Use this
               if the existing tests for IPv6 availability produce incorrect
               results or crashes.

           languages_filename
               If you want to be able to use the language-guessing rule
               "UNWANTED_LANGUAGE_BODY", and are using "config_text" instead
               of "rules_filename", "site_rules_filename", and
               "userprefs_filename", you will need to set this.  It should be
               the path to the languages file normally found in the
               SpamAssassin rules directory.

           local_tests_only
               If set to 1, no tests that require internet access will be
               performed. (default: 0)

           ignore_site_cf_files
               If set to 1, any rule files found in the "site_rules_filename"
               directory will be ignored.  *.pre files (used for loading
               plugins) found in the "site_rules_filename" directory will
               still be used. (default: 0)

           dont_copy_prefs
               If set to 1, the user preferences file will not be created if
               it doesn't already exist. (default: 0)

           save_pattern_hits
               If set to 1, the patterns hit can be retrieved from the
               "Mail::SpamAssassin::PerMsgStatus" object.  Used for debugging.

           home_dir_for_helpers
               If set, the HOME environment variable will be set to this value
               when using test applications that require their configuration
               data, such as Razor, Pyzor and DCC.

           username
               If set, the "username" attribute will use this as the current
               user's name.  Otherwise, the default is taken from the runtime
               environment (i.e. this process's effective UID under UNIX).

           If none of "rules_filename", "site_rules_filename",
           "userprefs_filename", or "config_text" is set, the
           "Mail::SpamAssassin" module will search for the configuration files
           in the usual installed locations, using the variable definitions
           below, which can also be passed in.

           PREFIX
               Used as the root for certain directory paths such as:

                 '__prefix__/etc/mail/spamassassin'
                 '__prefix__/etc/spamassassin'

               Defaults to "@@PREFIX@@".

           DEF_RULES_DIR
               Location where the default rules are installed.  Defaults to
               "@@DEF_RULES_DIR@@".

           LOCAL_RULES_DIR
               Location where the local site rules are installed.  Defaults to
               "@@LOCAL_RULES_DIR@@".

           LOCAL_STATE_DIR
               Location of the local state directory, mainly used for
               installing updates via "sa-update" and compiling rulesets to
               native code.  Defaults to "@@LOCAL_STATE_DIR@@".

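           As a rough sketch of how the constructor options above combine,
           the helper below assembles the hash reference passed to new().
           The helper name constructor_options() is hypothetical; only the
           option keys and the "info"/"all" facility names come from the list
           above.

```perl
use strict;
use warnings;

# Hypothetical helper: build constructor options for an offline scanner
# with debug output.  Only the hash keys are taken from the option list.
sub constructor_options {
    my (@facilities) = @_;
    return {
        # A comma-delimited string; an array or hash reference also works.
        debug            => join(',', @facilities),
        local_tests_only => 1,    # skip tests that need internet access
        dont_copy_prefs  => 1,    # never create a user preferences file
    };
}

# Typical use, assuming Mail::SpamAssassin is installed:
#   require Mail::SpamAssassin;
#   my $spamtest = Mail::SpamAssassin->new(constructor_options('info'));
```

           Keeping the options in one place makes it easier to build several
           factory objects with identical settings.
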
       parse($message, $parse_now)
           Returns a "Mail::SpamAssassin::Message" object with just the
           headers parsed.  Two optional parameters can be passed in:
           $message is either undef (which will use STDIN), a scalar of the
           entire message, an array reference of the message with 1 line per
           array element, or a file glob which holds the entire contents of
           the message; and $parse_now specifies whether to create the MIME
           tree at parse time or later as necessary.

           The $parse_now option is false (0) by default.  This spares
           SpamAssassin from generating the tree of internal data nodes if
           the information is not going to be used.  This is handy, for
           instance, when running "spamassassin -d", which only needs the
           pristine header and body, both of which are always parsed and
           stored by this function.

           For more information, please see the "Mail::SpamAssassin::Message"
           and "Mail::SpamAssassin::Message::Node" POD.

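           The array-reference form of $message and the $parse_now flag
           might be used as in the sketch below.  The function name
           parse_from_handle() is an assumption for illustration, and
           $spamtest is taken to be an existing "Mail::SpamAssassin" object.

```perl
use strict;
use warnings;

# Read a message from an open filehandle and hand it to parse() as an
# array reference with one line per element.
sub parse_from_handle {
    my ($spamtest, $fh, $parse_now) = @_;
    my @lines = <$fh>;    # list context: one line per array element
    # A true $parse_now builds the MIME tree immediately; false defers it.
    return $spamtest->parse(\@lines, $parse_now ? 1 : 0);
}
```

           Remember to call finish() on the returned message object once it
           is no longer needed, as in the SYNOPSIS.
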
       $status = $f->check ($mail)
           Check a mail, encapsulated in a "Mail::SpamAssassin::Message"
           object, to determine if it is spam or not.

           Returns a "Mail::SpamAssassin::PerMsgStatus" object which can be
           used to test or manipulate the mail message.

           Note that the "Mail::SpamAssassin" object can be re-used for
           further messages without affecting this check; in OO terminology,
           the "Mail::SpamAssassin" object is a "factory".  However, if you
           do this, be sure to call the "finish()" method on the status
           objects when you're done with them.

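           Re-using one factory for several messages could look roughly like
           this.  The function name scan_all() is hypothetical; the methods
           called are the ones shown in the SYNOPSIS.

```perl
use strict;
use warnings;

# Scan several raw messages with a single factory object and release the
# per-message objects after each one.  Returns a list of 0/1 verdicts.
sub scan_all {
    my ($spamtest, @raw_messages) = @_;
    my @verdicts;
    for my $raw (@raw_messages) {
        my $mail   = $spamtest->parse($raw);
        my $status = $spamtest->check($mail);
        push @verdicts, $status->is_spam() ? 1 : 0;
        $status->finish();    # needed when the factory is re-used
        $mail->finish();
    }
    return @verdicts;
}
```
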
       $status = $f->check_message_text ($mailtext)
           Check a mail, encapsulated in a plain string $mailtext, to
           determine if it is spam or not.

           Otherwise identical to "check()" above.

       $status = $f->learn ($mail, $id, $isspam, $forget)
           Learn from a mail, encapsulated in a "Mail::SpamAssassin::Message"
           object.

           If $isspam is set, the mail is assumed to be spam; otherwise it
           will be learnt as non-spam.

           If $forget is set, the attributes of the mail will be removed from
           both the non-spam and spam learning databases.

           $id is an optional message-identification string, used internally
           to tag the message.  If it is "undef", the Message-Id of the
           message will be used.  It should be unique to that message.

           Returns a "Mail::SpamAssassin::PerMsgLearner" object which can be
           used to manipulate the learning process for each mail.

           Note that the "Mail::SpamAssassin" object can be re-used for
           further messages without affecting this learning operation; in OO
           terminology, the "Mail::SpamAssassin" object is a "factory".
           However, if you do this, be sure to call the "finish()" method on
           the learner objects when you're done with them.

           "learn()" and "check()" can be run using the same factory.
           "init_learner()" must be called before using this method.

       $f->init_learner ( [ { opt => val, ... } ] )
           Initialise learning.  You may pass the following attribute-value
           pairs to this method.

           caller_will_untie
               Whether or not the code calling this method will take care of
               untie'ing from the Bayes databases (by calling
               "finish_learner()") (optional, default 0).

           force_expire
               Should an expiration run be forced to occur immediately?
               (optional, default 0).

           learn_to_journal
               Should learning data be written to the journal, instead of
               directly to the databases? (optional, default 0).

           wait_for_lock
               Whether or not to wait a long time for locks to complete
               (optional, default 0).

           opportunistic_expire_check_only
               During the opportunistic journal sync and expire check, don't
               actually do the expire but report back whether or not it should
               occur (optional, default 0).

           no_relearn
               If doing a learn operation, and the message has already been
               learned as the opposite type, don't re-learn the message.

       $f->rebuild_learner_caches ({ opt => val })
           Rebuild any cache databases; should be called after the learning
           process.  Options include: "verbose", which will output diagnostics
           to "stdout" if set to 1.

       $f->finish_learner ()
           Finish learning.

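           The learning methods above are typically called in sequence.  A
           minimal sketch, assuming $spamtest is a "Mail::SpamAssassin"
           object and $mail came from parse(): the function name learn_one()
           is hypothetical, and did_learn() is taken from the
           "Mail::SpamAssassin::PerMsgLearner" documentation rather than from
           this page.

```perl
use strict;
use warnings;

# Train on one parsed message and report whether anything was learned.
sub learn_one {
    my ($spamtest, $mail, $is_spam) = @_;

    # We untie the Bayes databases ourselves via finish_learner() below.
    $spamtest->init_learner({ caller_will_untie => 1 });

    # undef id: the Message-Id of the message is used instead.
    my $learner = $spamtest->learn($mail, undef, $is_spam ? 1 : 0, 0);
    my $learned = $learner->did_learn() ? 1 : 0;
    $learner->finish();

    $spamtest->rebuild_learner_caches({ verbose => 0 });
    $spamtest->finish_learner();
    return $learned;
}
```
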
       $f->dump_bayes_db()
           Dump the contents of the Bayes DB.

       $f->signal_user_changed ( [ { opt => val, ... } ] )
           Signals that the current user has changed (possibly using
           "setuid"), meaning that SpamAssassin should close any per-user
           databases it has open, and re-open using ones appropriate for the
           new user.

           Note that this should be called after reading any per-user
           configuration, as that data may override some paths opened in this
           method.  You may pass the following attribute-value pairs:

           username
               The username of the user.  This will be used for the "username"
               attribute.

           user_dir
               A directory to use as a 'home directory' for the current user's
               data, overriding the system default.  This directory must be
               readable and writable by the process.  Note that the resulting
               "userstate_dir" will be the ".spamassassin" subdirectory of
               this dir.

           userstate_dir
               A directory to use as a directory for the current user's data,
               overriding the system default.  This directory must be readable
               and writable by the process.  The default is
               "user_dir/.spamassassin".

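           The ordering requirement above (read per-user configuration first,
           then signal the change) can be sketched as follows.  The function
           name switch_user() and the "user_prefs" file name inside the
           ".spamassassin" directory are assumptions for illustration.

```perl
use strict;
use warnings;

# Switch a long-lived factory object to another user.  Per-user
# configuration is read first, because it may override paths that
# signal_user_changed() opens.
sub switch_user {
    my ($spamtest, $username, $user_dir) = @_;

    # Assumed location of the per-user preferences file.
    my $prefs = "$user_dir/.spamassassin/user_prefs";
    $spamtest->read_scoreonly_config($prefs) if -f $prefs;

    $spamtest->signal_user_changed({
        username => $username,
        user_dir => $user_dir,
    });
    return;
}
```
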
       $f->report_as_spam ($mail, $options)
           Report a mail, encapsulated in a "Mail::SpamAssassin::Message"
           object, as human-verified spam.  This will submit the mail message
           to live, collaborative, spam-blocker databases, allowing other
           users to block this message.

           It will also submit the mail to SpamAssassin's Bayesian learner.

           Options is an optional reference to a hash of options.  Currently
           these can be:

           dont_report_to_dcc
               Inhibits reporting of the spam to DCC.

           dont_report_to_pyzor
               Inhibits reporting of the spam to Pyzor.

           dont_report_to_razor
               Inhibits reporting of the spam to Razor.

           dont_report_to_spamcop
               Inhibits reporting of the spam to SpamCop.

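           The options hash might be built as in this sketch.  The function
           name report_without() is hypothetical; the generated keys are the
           "dont_report_to_*" options listed above.

```perl
use strict;
use warnings;

# Report a parsed message as spam while skipping the named services,
# e.g. report_without($spamtest, $mail, 'dcc', 'spamcop').
sub report_without {
    my ($spamtest, $mail, @skip) = @_;
    my %options = map { ( "dont_report_to_$_" => 1 ) } @skip;
    return $spamtest->report_as_spam($mail, \%options);
}
```
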
       $f->revoke_as_spam ($mail, $options)
           Revoke a mail, encapsulated in a "Mail::SpamAssassin::Message"
           object, as human-verified ham (non-spam).  This will revoke the
           mail message from live, collaborative, spam-blocker databases, so
           that it no longer causes other users to block this message.

           It will also submit the mail to SpamAssassin's Bayesian learner as
           nonspam.

           Options is an optional reference to a hash of options.  Currently
           these can be:

           dont_report_to_razor
               Inhibits revoking of the message from Razor.
       $f->add_address_to_whitelist ($addr)
           Given a string containing an email address, add it to the automatic
           whitelist database.

       $f->add_all_addresses_to_whitelist ($mail)
           Given a mail message, find as many addresses as possible in the
           usual headers (To, Cc, From etc.) and the message body, and add
           them to the automatic whitelist database.

       $f->remove_address_from_whitelist ($addr)
           Given a string containing an email address, remove it from the
           automatic whitelist database.

       $f->remove_all_addresses_from_whitelist ($mail)
           Given a mail message, find as many addresses as possible in the
           usual headers (To, Cc, From etc.) and the message body, and remove
           them from the automatic whitelist database.

       $f->add_address_to_blacklist ($addr)
           Given a string containing an email address, add it to the automatic
           whitelist database with a high score, effectively blacklisting
           it.

       $f->add_all_addresses_to_blacklist ($mail)
           Given a mail message, find addresses in the From headers and add
           them to the automatic whitelist database with a high score,
           effectively blacklisting them.

           Note that To and Cc addresses are not used.

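           A short sketch of driving the single-address methods from a data
           structure.  The function name apply_address_lists() and the
           "allow"/"deny" hash keys are hypothetical.

```perl
use strict;
use warnings;

# Add every address under {allow} to the automatic whitelist and every
# address under {deny} to it with a high score (blacklisting).
sub apply_address_lists {
    my ($spamtest, $lists) = @_;
    $spamtest->add_address_to_whitelist($_) for @{ $lists->{allow} || [] };
    $spamtest->add_address_to_blacklist($_) for @{ $lists->{deny}  || [] };
    return;
}
```
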
       $text = $f->remove_spamassassin_markup ($mail)
           Returns the text of the message, with any SpamAssassin-added text
           (such as the report, or X-Spam-Status headers) stripped.

           Note that the $mail object is not modified.

           Warning: if the input message in $mail contains a mixture of CR-LF
           (Windows-style) and LF (UNIX-style) line endings, it will be
           "canonicalized" to use one or the other consistently throughout.

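           For example, a previously tagged message could be restored along
           these lines.  The function name pristine_text() is hypothetical,
           and $spamtest is assumed to be an existing "Mail::SpamAssassin"
           object.

```perl
use strict;
use warnings;

# Return the text of a raw message with SpamAssassin markup stripped.
sub pristine_text {
    my ($spamtest, $raw) = @_;
    my $mail = $spamtest->parse($raw);
    my $text = $spamtest->remove_spamassassin_markup($mail);
    $mail->finish();    # release the message object once done
    return $text;
}
```
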
       $f->read_scoreonly_config ($filename)
           Read a configuration file and parse user preferences from it.

           User preferences are as defined in the "Mail::SpamAssassin::Conf"
           manual page.  In other words, they include scoring options, scores,
           whitelists and blacklists, and so on, but do not include rule
           definitions, privileged settings, etc. unless "allow_user_rules"
           is enabled; and they never include the administrator settings.

       $f->load_scoreonly_sql ($username)
           Read configuration parameters from an SQL database and parse scores
           from it.  This will only take effect if the perl "DBI" module is
           installed, and the configuration parameters "user_scores_dsn",
           "user_scores_sql_username", and "user_scores_sql_password" are set
           correctly.

           The username in $username will also be used for the "username"
           attribute of the Mail::SpamAssassin object.

       $f->load_scoreonly_ldap ($username)
           Read configuration parameters from an LDAP server and parse scores
           from it.  This will only take effect if the perl "Net::LDAP" and
           "URI" modules are installed, and the configuration parameters
           "user_scores_dsn", "user_scores_ldap_username", and
           "user_scores_ldap_password" are set correctly.

           The username in $username will also be used for the "username"
           attribute of the Mail::SpamAssassin object.

       $f->set_persistent_address_list_factory ($factoryobj)
           Set the persistent address list factory, used to create objects for
           the automatic whitelist algorithm's persistent-storage back-end.
           See "Mail::SpamAssassin::PersistentAddrList" for the API these
           factory objects must implement, and the API the objects they
           produce must implement.

       $f->compile_now ($use_user_prefs, $keep_userstate)
           Compile all patterns, load all configuration files, and load all
           possibly-required Perl modules.

           Normally, Mail::SpamAssassin uses lazy evaluation where possible,
           but if you plan to fork() or start a new perl interpreter thread to
           process a message, this is suboptimal, as each process/thread will
           have to perform these actions.

           Call this function in the master thread or process to perform the
           actions straightaway, so that the sub-processes will not have to.

           If $use_user_prefs is 0, this will initialise the SpamAssassin
           configuration without reading the per-user configuration file and
           it will assume that you will call "read_scoreonly_config" at a
           later point.

           If $keep_userstate is true, compile_now() will revert any
           configuration options which have a default with __userstate__ in
           it post-init(), and then re-change the option before returning.
           This lets you change $ENV{'HOME'} to a temp directory and have
           compile_now() create any files there as necessary, without
           disturbing the actual files at the locations set by a
           configuration option.  By default, this is disabled.

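           A pre-forking daemon might use compile_now() roughly as sketched
           below.  The function name prefork_workers() and the $work callback
           are hypothetical; only the compile_now() call comes from this
           page.

```perl
use strict;
use warnings;

# Compile everything once in the parent, then fork $count workers that
# inherit the compiled state.  $work is a code reference run in each child.
sub prefork_workers {
    my ($spamtest, $count, $work) = @_;

    $spamtest->compile_now(0);    # 0: skip the per-user configuration file

    my @pids;
    for my $n (1 .. $count) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {          # child: rules are already compiled
            $work->($spamtest, $n);
            exit 0;
        }
        push @pids, $pid;
    }
    waitpid($_, 0) for @pids;     # parent waits for every worker
    return scalar @pids;
}
```

           Each child would normally call read_scoreonly_config() for its
           user before scanning, since the per-user file was skipped here.
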
       $f->debug_diagnostics ()
           Output some diagnostic information, useful for debugging
           SpamAssassin problems.

       $failed = $f->lint_rules ()
           Syntax-check the current set of rules.  Returns the number of
           syntax errors discovered, or 0 if the configuration is valid.

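           Because the return value is an error count, a boolean wrapper can
           be handy.  A minimal sketch; the function name config_is_valid()
           is hypothetical.

```perl
use strict;
use warnings;

# Return 1 if the loaded rules pass the syntax check, 0 otherwise.
sub config_is_valid {
    my ($spamtest) = @_;
    my $failed = $spamtest->lint_rules();
    warn "lint: $failed syntax error(s) found\n" if $failed;
    return $failed ? 0 : 1;
}
```
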
       $f->finish()
           Destroy this object, so that it will be garbage-collected once it
           goes out of scope.  The object will no longer be usable after this
           method is called.

       $fullpath = $f->find_rule_support_file ($filename)
           Find a rule-support file, such as "languages" or "triplets.txt", in
           the system-wide rules directory, and return its full path if it
           exists, or undef if it doesn't exist.

           (This API was added in SpamAssassin 3.1.1.)

       $f->create_default_prefs ($filename, $username [ , $userdir ] )
           Copy default preferences file into home directory for later use and
           modification, if it does not already exist and "dont_copy_prefs" is
           not set.

       $f->copy_config ( [ $source ], [ $dest ] )
           Used for daemons to keep a persistent Mail::SpamAssassin object's
           configuration correct if switching between users.  Pass an
           associative array reference as either $source or $dest, and set
           the other to 'undef' so that the object will use its current
           configuration.  For example:

             # create object w/ configuration
             my $spamtest = Mail::SpamAssassin->new( ... );

             # backup configuration to %conf_backup
             my %conf_backup = ();
             $spamtest->copy_config(undef, \%conf_backup) ||
               die "config: error returned from copy_config!\n";

             # ... do stuff, perhaps modify the config, etc ...

             # reset the configuration back to the original
             $spamtest->copy_config(\%conf_backup, undef) ||
               die "config: error returned from copy_config!\n";

           Note that the contents of the associative arrays should be
           considered opaque by calling code.

       @plugins = $f->get_loaded_plugins_list ( )
           Return the list of plugins currently loaded by this SpamAssassin
           object's configuration; each entry in the list is an object of type
           "Mail::SpamAssassin::Plugin".

           (This API was added in SpamAssassin 3.2.0.)

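           Since each entry is an object, its class name can be obtained with
           Perl's built-in ref().  A small sketch; the function name
           loaded_plugin_names() is hypothetical.

```perl
use strict;
use warnings;

# Return an array reference of the class names of all loaded plugins,
# sorted alphabetically.
sub loaded_plugin_names {
    my ($spamtest) = @_;
    my @names = map { ref $_ } $spamtest->get_loaded_plugins_list();
    return [ sort @names ];
}
```
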

PREREQUISITES

       "HTML::Parser" "Sys::Syslog"

MORE DOCUMENTATION

       See also <http://spamassassin.apache.org/> and
       <http://wiki.apache.org/spamassassin/> for more information.

SEE ALSO

       Mail::SpamAssassin::Conf(3) Mail::SpamAssassin::PerMsgStatus(3)
       spamassassin(1) sa-update(1)

BUGS

       See <http://issues.apache.org/SpamAssassin/>

AUTHORS

       The SpamAssassin(tm) Project <http://spamassassin.apache.org/>

       SpamAssassin is distributed under the Apache License, Version 2.0, as
       described in the file "LICENSE" included with the distribution.

AVAILABILITY

       The latest version of this library is likely to be available from CPAN
       as well as:

         <http://spamassassin.apache.org/>



perl v5.8.8                       2008-01-05             Mail::SpamAssassin(3)