Mail::SpamAssassin::PerMsgStatus(3pm)

Mail::SpamAssassin::PerMsgStatus(3)   User Contributed Perl Documentation   Mail::SpamAssassin::PerMsgStatus(3)

NAME

       Mail::SpamAssassin::PerMsgStatus - per-message status (spam or
       not-spam)

SYNOPSIS

         my $spamtest = Mail::SpamAssassin->new({
           'rules_filename'      => '/etc/spamassassin.rules',
           'userprefs_filename'  => $ENV{HOME}.'/.spamassassin/user_prefs'
         });
         # with no argument, parse() reads the message from STDIN
         my $mail = $spamtest->parse();

         my $status = $spamtest->check ($mail);

         my $rewritten_mail;
         if ($status->is_spam()) {
           $rewritten_mail = $status->rewrite_mail ();
         }
         ...

DESCRIPTION

       The Mail::SpamAssassin "check()" method returns an object of this
       class.  This object encapsulates all the per-message state.

METHODS

       $status->check ()
           Runs the SpamAssassin rules against the message pointed to by the
           object.

       $status->learn()
           After a mail message has been checked, this method can be called.
           If the score is outside a certain range around the threshold, i.e.
           if the message is judged more-or-less definitely spam or definitely
           non-spam, it will be fed into SpamAssassin's learning systems
           (currently the naive Bayesian classifier), so that future similar
           mails will be caught.

       $score = $status->get_autolearn_points()
           Return the message's score as computed for auto-learning.  Certain
           tests are ignored:

             - rules with tflags set to 'learn' (the Bayesian rules)

             - rules with tflags set to 'userconf' (user white/black-listing rules, etc)

             - rules with tflags set to 'noautolearn'

           Also note that auto-learning occurs using scores from either
           scoreset 0 or 1, depending on what scoreset is used during message
           check.  It is likely that the message check and auto-learn scores
           will be different.

       $score = $status->get_head_only_points()
           Return the message's score as computed for auto-learning, ignoring
           all rules except for header-based ones.

       $score = $status->get_learned_points()
           Return the message's score as computed for auto-learning, ignoring
           all rules except for learning-based ones.

       $score = $status->get_body_only_points()
           Return the message's score as computed for auto-learning, ignoring
           all rules except for body-based ones.

       $score = $status->get_autolearn_force_status()
           Return whether a message's score included any rules that are
           flagged as autolearn_force.

       $rule_names = $status->get_autolearn_force_names()
           Return a comma-separated list of rule names if a message's
           score included any rules that are flagged as autolearn_force.

       $isspam = $status->is_spam ()
           After a mail message has been checked, this method can be called.
           It will return 1 for mail determined likely to be spam, 0 if it
           does not seem spam-like.

       $list = $status->get_names_of_tests_hit ()
           After a mail message has been checked, this method can be called.
           It will return a comma-separated string, listing all the symbolic
           test names of the tests which were triggered by the mail.

       $list = $status->get_names_of_subtests_hit ()
           After a mail message has been checked, this method can be called.
           It will return a comma-separated string, listing all the symbolic
           test names of the meta-rule sub-tests which were triggered by the
           mail.  Sub-tests are the normally-hidden rules, which score 0 and
           have names beginning with two underscores, used in meta rules.

       $num = $status->get_score ()
           After a mail message has been checked, this method can be called.
           It will return the message's score.

       $num = $status->get_required_score ()
           After a mail message has been checked, this method can be called.
           It will return the score required for a mail to be considered spam.
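
           The accessors above can be combined into a one-line summary of a
           checked message.  The following is a minimal sketch;
           summarize_status() is a hypothetical helper name, not part of
           SpamAssassin, and it assumes it is handed an object that has
           already been through check().

```perl
use strict;
use warnings;

# Build a one-line summary from a checked PerMsgStatus object.
# summarize_status() is a hypothetical helper, not part of SpamAssassin.
sub summarize_status {
    my ($status) = @_;

    my $verdict = $status->is_spam() ? 'spam' : 'ham';
    my $tests   = $status->get_names_of_tests_hit();   # comma-separated string

    return sprintf('%s score=%.1f required=%.1f tests=%s',
                   $verdict,
                   $status->get_score(),
                   $status->get_required_score(),
                   $tests);
}
```

           A caller would typically use it as
           summarize_status($spamtest->check($mail)).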

       $num = $status->get_autolearn_status ()
           After a mail message has been checked, this method can be called.
           It will return one of the following strings depending on whether
           the mail was auto-learned or not: "ham", "no", "spam", "disabled",
           "failed", "unavailable".

           If the message is flagged with autolearn_force, the returned
           string will also include the status and the rules hit.  For
           example: "autolearn_force=yes (AUTOLEARNTEST_BODY)"

       $report = $status->get_report ()
           Deliver a "spam report" on the checked mail message.  This contains
           details of how many spam detection rules it triggered.

           The report is returned as a multi-line string, with the lines
           separated by "\n" characters.

       $preview = $status->get_content_preview ()
           Give a "preview" of the content.

           This is returned as a multi-line string, with the lines separated
           by "\n" characters, containing a fully-decoded, safe, plain-text
           sample of the first few lines of the message body.

       $msg = $status->get_message()
           Return the object representing the message being scanned.

       $status->rewrite_mail ()
           Rewrite the mail message.  This will at minimum add headers, and at
           maximum MIME-encapsulate the message text, to reflect its spam or
           not-spam status.  The function will return a scalar of the
           rewritten message.

           The actual modifications depend on the configuration (see
           "Mail::SpamAssassin::Conf" for more information).

           The possible modifications are as follows:

           To:, From: and Subject: modification on spam mails
               Depending on the configuration, the To: and From: lines can
               have a user-defined RFC 2822 comment appended for spam mail.
               The subject line may have a user-defined string prepended to it
               for spam mail.

           X-Spam-* headers for all mails
               Depending on the configuration, zero or more headers with names
               beginning with "X-Spam-" will be added to mail depending on
               whether it is spam or ham.

           spam message with report_safe
               If report_safe is set to true (1), then spam messages are
               encapsulated into their own message/rfc822 MIME attachment
               without any modifications being made.

               If report_safe is set to false (0), then the message will only
               have the above headers added/modified.

       $status->action_depends_on_tags($tags, $code, @args)
           Enqueue the supplied subroutine reference $code, to become runnable
           when all the specified tags become available. The $tags may be a
           simple scalar - a tag name, or a listref of tag names. The
           subroutine &$code when called will be passed a "permessagestatus"
           object as its first argument, followed by the supplied (optional)
           list @args.
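
           As an illustration of the calling convention, the sketch below
           defers a small piece of work until two tags are available.  The
           helper name log_when_ready(), the tag names MYTAG_ONE and
           MYTAG_TWO, and the $out array reference are all assumptions made
           for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: defer work until two tags have been set.
# The tag names and the $label argument are illustrative only.
sub log_when_ready {
    my ($status, $label, $out) = @_;

    $status->action_depends_on_tags(
        [ 'MYTAG_ONE', 'MYTAG_TWO' ],          # listref of tag names
        sub {
            my ($pms, @args) = @_;             # PerMsgStatus object, then @args
            my $one = $pms->get_tag('MYTAG_ONE');
            my $two = $pms->get_tag('MYTAG_TWO');
            push @$out, "$args[0]: $one/$two";
        },
        $label,                                # passed through as @args
    );
    return;
}
```

           The callback receives the status object first, so it does not
           need to capture $status from the enclosing scope.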

       $status->set_tag($tagname, $value)
           Set a template tag, as used in "add_header", report templates, etc.
           This API is intended for use by plugins.  Tag names will be
           converted to an all-uppercase representation internally.

           $value can be a simple scalar (string or number), or a reference to
           an array, in which case the public method get_tag will join array
           elements using a space as a separator, returning a single string
           for backward compatibility.

           $value can also be a subroutine reference, which will be evaluated
           each time the template is expanded. The first argument passed by
           get_tag to a called subroutine will be a PerMsgStatus object (this
           module's object), followed by optional arguments provided by a
           caller to get_tag.

           Note that perl supports closures, which means that variables set in
           the caller's scope can be accessed inside this "sub". For example:

               my $text = "hello world!";
               $status->set_tag("FOO", sub {
                         my $pms = shift;
                         return $text;
                       });

           See "Mail::SpamAssassin::Conf"'s "TEMPLATE TAGS" section for more
           details on how template tags are used.

           "undef" will be returned if a tag by that name has not been
           defined.

       $string = $status->get_tag($tagname)
           Get the current value of a template tag, as used in "add_header",
           report templates, etc. This API is intended for use by plugins.
           Tag names will be converted to an all-uppercase representation
           internally.  See "Mail::SpamAssassin::Conf"'s "TEMPLATE TAGS"
           section for more details on tags.

           "undef" will be returned if a tag by that name has not been
           defined.

       $string = $status->get_tag_raw($tagname, @args)
           Similar to "get_tag", but keeps a tag name unchanged (does not
           uppercase it), and does not convert arrayref tag values into a
           single string.
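
           The three value kinds accepted by set_tag can be seen side by
           side in a short sketch.  publish_tags() is a hypothetical plugin
           helper, and the tag names MYHOSTS and MYCOUNT are made up for the
           example.

```perl
use strict;
use warnings;

# Hypothetical plugin helper: publish two tags on a PerMsgStatus object.
# The tag names MYHOSTS and MYCOUNT are made up for this example.
sub publish_tags {
    my ($status, @hosts) = @_;

    # Arrayref value: get_tag() joins the elements with a space, while
    # get_tag_raw() leaves the arrayref unconverted.
    $status->set_tag('MYHOSTS', [@hosts]);

    # Coderef value: evaluated each time the tag is read; the closure
    # captures @hosts from the enclosing scope.
    $status->set_tag('MYCOUNT', sub {
        my ($pms) = @_;
        return scalar @hosts;
    });
    return;
}
```

           After publish_tags($status, 'a.example', 'b.example'), the
           description above implies that get_tag('MYHOSTS') yields the
           single string "a.example b.example".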

       $status->set_spamd_result_item($subref)
           Set an entry for the spamd result log line.  $subref should be a
           code reference for a subroutine which will return a string in
           'name=VALUE' format, similar to the other entries in the spamd
           result line:

             Jul 17 14:10:47 radish spamd[16670]: spamd: result: Y 22 - ALL_NATURAL,
             DATE_IN_FUTURE_03_06,DIET_1,DRUGS_ERECTILE,DRUGS_PAIN,
             TEST_FORGED_YAHOO_RCVD,TEST_INVALID_DATE,TEST_NOREALNAME,
             TEST_NORMAL_HTTP_TO_IP,UNDISC_RECIPS scantime=0.4,size=3138,user=jm,
             uid=1000,required_score=5.0,rhost=localhost,raddr=127.0.0.1,
             rport=33153,mid=<9PS291LhupY>,autolearn=spam

           "name" and "VALUE" must not contain "=" or "," characters, as it is
           important that these log lines are easy to parse.

           The code reference will be called by spamd after the message has
           been scanned, and the "PerMsgStatus::check()" method has returned.
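
           One way to respect the "=" and "," restriction is to validate the
           pieces before building the code reference.  The sketch below
           assumes a hypothetical helper named make_result_item(); only
           set_spamd_result_item() itself comes from this module.

```perl
use strict;
use warnings;

# Hypothetical helper: build a code reference suitable for
# set_spamd_result_item(). It refuses names or values containing
# "=" or ",", since those would make the log line ambiguous.
sub make_result_item {
    my ($name, $value) = @_;

    for my $part ($name, $value) {
        die "result item must not contain '=' or ','\n"
            if $part =~ /[=,]/;
    }
    return sub { return "$name=$value" };
}
```

           A plugin could then call
           $status->set_spamd_result_item(make_result_item('myplugin', 'ok')).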

       $status->finish ()
           Indicate that this $status object is finished with, and can be
           destroyed.

           If you are using SpamAssassin in a persistent environment, or
           checking many mail messages from one "Mail::SpamAssassin" factory,
           this method should be called to ensure Perl's garbage collection
           will clean up old status objects.
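
           In a loop over many messages this might look like the sketch
           below.  count_spam() is a hypothetical helper; it assumes
           $spamtest is a "Mail::SpamAssassin" object and that each element
           of @raw_messages is a string holding one complete message.

```perl
use strict;
use warnings;

# Hypothetical helper: scan several raw messages with one
# Mail::SpamAssassin object, releasing per-message state as we go.
sub count_spam {
    my ($spamtest, @raw_messages) = @_;
    my $spam = 0;

    for my $raw (@raw_messages) {
        my $mail   = $spamtest->parse($raw);
        my $status = $spamtest->check($mail);

        $spam++ if $status->is_spam();

        $status->finish();   # done with this PerMsgStatus object
        $mail->finish();     # likewise for the parsed message object
    }
    return $spam;
}
```

           Calling finish() inside the loop, rather than once at the end,
           keeps the number of live status objects constant however many
           messages are scanned.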

       $name = $status->get_current_eval_rule_name()
           Return the name of the currently-running eval rule.  "undef" is
           returned if no eval rule is currently being run.  Useful for
           plugins to determine the current rule name while inside an eval
           test function call.

       $status->get_decoded_body_text_array ()
           Returns the message body, with base64 or quoted-printable encodings
           decoded, and non-text parts or non-inline attachments stripped.

           It is returned as an array of strings, with each string
           representing one newline-separated line of the body.

       $status->get_decoded_stripped_body_text_array ()
           Returns the message body, decoded (as described in
           get_decoded_body_text_array()), with HTML rendered, and with
           whitespace normalized.

           It will always render text/html, and will use a heuristic to
           determine if other text/* parts should be considered text/html.

           It is returned as an array of strings, with each string
           representing one 'paragraph'.  Paragraphs, in plain-text mails, are
           double-newline-separated blocks of multi-line text.

       $status->get (header_name [, default_value])
           Returns a message header, pseudo-header, real name or address.
           "header_name" is the name of a mail header, such as 'Subject',
           'To', etc.  If "default_value" is given, it will be used if the
           requested "header_name" does not exist.

           Appending ":raw" to the header name will inhibit decoding of
           quoted-printable or base-64 encoded strings.

           Appending a modifier ":addr" to a header field name will cause
           everything except the first email address to be removed from the
           header field.  It is mainly applicable to header fields 'From',
           'Sender', 'To', 'Cc' along with their 'Resent-*' counterparts, and
           the 'Return-Path'. For example, all of the following will result in
           "example@foo":

           example@foo
           example@foo (Foo Blah)
           example@foo, example@bar
           display: example@foo (Foo Blah), example@bar ;
           Foo Blah <example@foo>
           "Foo Blah" <example@foo>
           "'Foo Blah'" <example@foo>

           Appending a modifier ":name" to a header field name will cause
           everything except the first display name to be removed from the
           header field. It is mainly applicable to header fields containing a
           single mail address: 'From', 'Sender', along with their
           'Resent-From' and 'Resent-Sender' counterparts.  For example, all
           of the following will result in "Foo Blah". One level of single
           quotes is stripped too, as it is often seen.

           example@foo (Foo Blah)
           example@foo (Foo Blah), example@bar
           display: example@foo (Foo Blah), example@bar ;
           Foo Blah <example@foo>
           "Foo Blah" <example@foo>
           "'Foo Blah'" <example@foo>

           There are several special pseudo-headers that can be specified:

           "ALL" can be used to mean the text of all the message's headers.
           "ALL-TRUSTED" can be used to mean the text of all the message's
           headers that could only have been added by trusted relays.
           "ALL-INTERNAL" can be used to mean the text of all the message's
           headers that could only have been added by internal relays.
           "ALL-UNTRUSTED" can be used to mean the text of all the message's
           headers that may have been added by untrusted relays.  To make this
           pseudo-header more useful for header rules the 'Received' header
           that was added by the last trusted relay is included, even though
           it can be trusted.
           "ALL-EXTERNAL" can be used to mean the text of all the message's
           headers that may have been added by external relays.  Like
           "ALL-UNTRUSTED" the 'Received' header added by the last internal
           relay is included.
           "ToCc" can be used to mean the contents of both the 'To' and 'Cc'
           headers.
           "EnvelopeFrom" is the address used in the 'MAIL FROM:' phase of the
           SMTP transaction that delivered this message, if this data has been
           made available by the SMTP server.
           "MESSAGEID" is a symbol meaning all Message-Id's found in the
           message; some mailing list software moves the real 'Message-Id' to
           'Resent-Message-Id' or 'X-Message-Id', then uses its own one in the
           'Message-Id' header.  The value returned for this symbol is the
           text from all 3 headers, separated by newlines.
           "X-Spam-Relays-Untrusted" is the generated metadata of untrusted
           relays the message has passed through.
           "X-Spam-Relays-Trusted" is the generated metadata of trusted relays
           the message has passed through.
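
           The modifiers and pseudo-headers described above can be combined
           in one place, as in the following sketch.  sender_summary() is a
           hypothetical helper, and the choice of fields is only an
           illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: collect a few sender details from a checked message.
sub sender_summary {
    my ($status) = @_;

    # Scalar assignments keep every get() call in scalar context.
    my $addr     = $status->get('From:addr');      # first address in From
    my $name     = $status->get('From:name');      # first display name in From
    my $subject  = $status->get('Subject:raw', ''); # undecoded; '' if missing
    my $envelope = $status->get('EnvelopeFrom');   # only if the SMTP server
                                                   # made it available

    return {
        from_addr => $addr,
        from_name => $name,
        subject   => $subject,
        envelope  => $envelope,
    };
}
```

           Passing '' as the default for 'Subject:raw' means the caller does
           not have to test that field for definedness.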

       $status->get_uri_list ()
           Returns an array of all unique URIs found in the message.  It takes
           a combination of the URIs found in the rendered (decoded and HTML
           stripped) body and the URIs found when parsing the HTML in the
           message.  Will also set $status->{uri_list} (the array as returned
           by this function).

           The returned array will include the "raw" URI as well as "slightly
           cooked" versions.  For example, the single URI
           'http://%77&#00119;%77.example.com/' will get turned into: (
           'http://%77&#00119;%77.example.com/', 'http://www.example.com/' )

       $status->get_uri_detail_list ()
           Returns a hash reference of all unique URIs found in the message
           and various data about where the URIs were found in the message.
           It takes a combination of the URIs found in the rendered (decoded
           and HTML stripped) body and the URIs found when parsing the HTML in
           the message.  Will also set $status->{uri_detail_list} (the hash
           reference as returned by this function).  This function will also
           set $status->{uri_domain_count} (count of unique domains).

           The hash format looks something like this:

             raw_uri => {
               types => { a => 1, img => 1, parsed => 1 },
               cleaned => [ canonified_uri ],
               anchor_text => [ "click here", "no click here" ],
               domains => { domain1 => 1, domain2 => 1 },
             }

           "raw_uri" is whatever the URI was in the message itself
           (http://spamassassin.apache%2Eorg/).

           "types" is a hash of the HTML tags (lowercase) which referenced the
           raw_uri.  parsed is a faked type which specifies that the raw_uri
           was seen in the rendered text.

           "cleaned" is an array of the raw and canonified version of the
           raw_uri (http://spamassassin.apache%2Eorg/,
           http://spamassassin.apache.org/).

           "anchor_text" is an array of the anchor text (text between <a> and
           </a>), if any, which linked to the URI.

           "domains" is a hash of the domains found in the canonified URIs.

           "hosts" is a hash of unstripped hostnames found in the canonified
           URIs as hash keys, with their domain part stored as a value of each
           hash entry.
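
           A structure of this shape can be walked with ordinary hash
           operations.  The sketch below assumes a hypothetical helper,
           anchor_domains(), that takes the hash reference and returns the
           domains of URIs referenced from an <a> tag.

```perl
use strict;
use warnings;

# Hypothetical helper: given the hash reference returned by
# get_uri_detail_list(), return the sorted unique domains of URIs
# that were referenced from an <a> tag.
sub anchor_domains {
    my ($detail) = @_;
    my %seen;

    for my $raw_uri (keys %$detail) {
        my $info = $detail->{$raw_uri};
        next unless $info->{types} && $info->{types}{a};
        $seen{$_} = 1 for keys %{ $info->{domains} || {} };
    }

    my @domains = sort keys %seen;
    return @domains;
}
```

           It would be called as
           anchor_domains($status->get_uri_detail_list()).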

       $status->clear_test_state()
           Clear test state, including test log messages from
           "$status->test_log()".

       $status->got_hit ($rulename, $desc_prepend [, name => value, ...])
           Register a hit against a rule in the ruleset.

           There are two mandatory arguments. These are $rulename, the name of
           the rule that fired, and $desc_prepend, which is a short string
           that will be prepended to the rule's "describe" string in output
           reports.

           In addition, callers can supplement that with the following
           optional data:

           score => $num
               Optional: the score to use for the rule hit.  If unspecified,
               the value from the "Mail::SpamAssassin::Conf" object's
               "{scores}" hash will be used (a configured score), and in its
               absence the "defscore" option value.

           defscore => $num
               Optional: the score to use for the rule hit if neither the
               option "score" is provided, nor a configured score value is
               provided.

           value => $num
               Optional: the value to assign to the rule; the default value is
               1.  tflags multiple rules use values of greater than 1 to
               indicate multiple hits.  This value is accessible to meta
               rules.

           ruletype => $type
               Optional, but recommended: the rule type string.  This is used
               in the "hit_rule" plugin call, called by this method.  If
               unset, 'unknown' is used.

           tflags => $string
               Optional: a string, i.e. a space-separated list of additional
               tflags to be appended to an existing list of flags in
               $self->{conf}->{tflags}, such as: "nice noautolearn multiple".
               No syntax checks are performed.

           description => $string
               Optional: a custom rule description string.  This is used in
               the "hit_rule" plugin call, called by this method. If unset,
               the static description is used.

           Backward compatibility: the two mandatory arguments have been part
           of this API since SpamAssassin 2.x.  The optional name => value
           pairs, however, are a new addition in SpamAssassin 3.2.0.
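
           A call using several of the optional pairs might look like the
           sketch below.  report_matches() is a hypothetical plugin helper;
           the "MYPLUGIN: " prefix and the 0.5 default score are arbitrary
           values chosen for illustration.

```perl
use strict;
use warnings;

# Hypothetical plugin helper: report $count matches of a rule.
# The "MYPLUGIN: " prefix and the default score are illustrative only.
sub report_matches {
    my ($pms, $rulename, $count) = @_;
    return 0 unless $count > 0;

    $pms->got_hit(
        $rulename,
        'MYPLUGIN: ',            # prepended to the rule description
        defscore => 0.5,         # used only if no score is configured
        value    => $count,      # visible to meta rules
        ruletype => 'eval',
    );
    return 1;
}
```

           Supplying defscore rather than score leaves any score configured
           for the rule in effect, following the precedence described above.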

       $status->create_fulltext_tmpfile (fulltext_ref)
           This function creates a temporary file containing the passed scalar
           reference data (typically the full/pristine text of the message).
           This is typically used by external programs like pyzor and dccproc,
           to avoid hangs due to buffering issues.  Methods that need this
           should call $self->create_fulltext_tmpfile($fulltext) to retrieve
           the temporary filename; it will be created if it has not already
           been.

           Note: This can only be called once until
           $status->delete_fulltext_tmpfile() is called.

       $status->delete_fulltext_tmpfile ()
           Cleans up after a $status->create_fulltext_tmpfile() call.
           Deletes the temporary file and uncaches the filename.
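
           Because the create and delete calls must be paired, it can help
           to wrap them.  The following is a minimal sketch, assuming a
           hypothetical wrapper named with_fulltext_tmpfile() that takes a
           code reference to run while the file exists.

```perl
use strict;
use warnings;

# Hypothetical wrapper: run $code with the name of the temporary file
# and delete the file afterwards, even if $code dies.
sub with_fulltext_tmpfile {
    my ($status, $fulltext_ref, $code) = @_;

    my $path   = $status->create_fulltext_tmpfile($fulltext_ref);
    my @result = eval { $code->($path) };
    my $err    = $@;

    $status->delete_fulltext_tmpfile();   # always clean up
    die $err if $err;                     # then re-raise any failure
    return @result;
}
```

           The code reference would typically run the external program
           against the file name it is given.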

       all_from_addrs_domains
           This function returns all the various from addresses in a message
           using all_from_addrs() and then returns only the domain names.
