Mail::SpamAssassin::PerMsgStatus(3pm)

Mail::SpamAssassin::PerMsgStatus(3)  User Contributed Perl Documentation  Mail::SpamAssassin::PerMsgStatus(3)

NAME

       Mail::SpamAssassin::PerMsgStatus - per-message status (spam or
       not-spam)

SYNOPSIS

         my $spamtest = Mail::SpamAssassin->new({
           'rules_filename'      => '/etc/spamassassin.rules',
           'userprefs_filename'  => $ENV{HOME}.'/.spamassassin/user_prefs'
         });
         my $mail = $spamtest->parse();

         my $status = $spamtest->check ($mail);

         my $rewritten_mail;
         if ($status->is_spam()) {
           $rewritten_mail = $status->rewrite_mail ();
         }
         ...

DESCRIPTION

       The Mail::SpamAssassin "check()" method returns an object of this
       class.  This object encapsulates all the per-message state.

METHODS

       $status->check ()
           Runs the SpamAssassin rules against the message pointed to by the
           object.

       $status->learn()
           After a mail message has been checked, this method can be called.
           If the score is outside a certain range around the threshold, i.e.
           if the message is judged more-or-less definitely spam or definitely
           non-spam, it will be fed into SpamAssassin's learning systems
           (currently the naive Bayesian classifier), so that future similar
           mails will be caught.

       $score = $status->get_autolearn_points()
           Return the message's score as computed for auto-learning.  Certain
           tests are ignored:

             - rules with tflags set to 'learn' (the Bayesian rules)

             - rules with tflags set to 'userconf' (user white/black-listing rules, etc)

             - rules with tflags set to 'noautolearn'

           Also note that auto-learning occurs using scores from either
           scoreset 0 or 1, depending on what scoreset is used during message
           check.  It is likely that the message check and auto-learn scores
           will be different.

       $score = $status->get_head_only_points()
           Return the message's score as computed for auto-learning, ignoring
           all rules except for header-based ones.

       $score = $status->get_learned_points()
           Return the message's score as computed for auto-learning, ignoring
           all rules except for learning-based ones.

       $score = $status->get_body_only_points()
           Return the message's score as computed for auto-learning, ignoring
           all rules except for body-based ones.

       $score = $status->get_autolearn_force_status()
           Return whether a message's score included any rules that are
           flagged as autolearn_force.

       $rule_names = $status->get_autolearn_force_names()
           Return a comma-separated list of rule names if a message's score
           included any rules that are flagged as autolearn_force.

       $isspam = $status->is_spam ()
           After a mail message has been checked, this method can be called.
           It will return 1 for mail determined likely to be spam, 0 if it
           does not seem spam-like.

       $list = $status->get_names_of_tests_hit ()
           After a mail message has been checked, this method can be called.
           It will return a comma-separated string, listing all the symbolic
           test names of the tests which were triggered by the mail.

       $list = $status->get_names_of_tests_hit_with_scores_hash ()
           After a mail message has been checked, this method can be called.
           It will return a reference to a hash of rule and score pairs,
           covering the symbolic test names and individual scores of the
           tests which were triggered by the mail.

       $list = $status->get_names_of_tests_hit_with_scores ()
           After a mail message has been checked, this method can be called.
           It will return a comma-separated string of rule=score pairs for all
           the symbolic test names and individual scores of the tests which
           were triggered by the mail.

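           As a minimal sketch, the rule=score string described above can be
           split into a hash in plain Perl.  The helper name
           parse_rule_scores is hypothetical, and the sketch assumes every
           pair has the form NAME=score with no embedded commas.

```perl
use strict;
use warnings;

# Hypothetical helper: turn "RULE_A=1.5,RULE_B=-0.1" into a hash reference.
# Assumes the comma-separated rule=score format described above.
sub parse_rule_scores {
    my ($list) = @_;
    my %scores;
    for my $pair (split /,/, $list // '') {
        my ($rule, $score) = split /=/, $pair, 2;
        next unless defined $score;      # skip entries without a score
        $scores{$rule} = $score + 0;     # store the score as a number
    }
    return \%scores;
}
```

           It could be called as
           parse_rule_scores($status->get_names_of_tests_hit_with_scores()).
           When only the hash is needed,
           get_names_of_tests_hit_with_scores_hash() already returns one.
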
       $list = $status->get_names_of_subtests_hit ()
           After a mail message has been checked, this method can be called.
           It will return a comma-separated string, listing all the symbolic
           test names of the meta-rule sub-tests which were triggered by the
           mail.  Sub-tests are the normally-hidden rules, which score 0 and
           have names beginning with two underscores, used in meta rules.

       $num = $status->get_score ()
           After a mail message has been checked, this method can be called.
           It will return the message's score.

       $num = $status->get_required_score ()
           After a mail message has been checked, this method can be called.
           It will return the score required for a mail to be considered spam.

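           The accessors above can be combined into a one-line summary.  The
           following is only a sketch: summarize_status is a hypothetical
           helper, and it assumes $status has already been checked.

```perl
use strict;
use warnings;

# Hypothetical helper: build a one-line summary from a checked $status.
# Uses only methods documented on this page.
sub summarize_status {
    my ($status) = @_;
    my $verdict = $status->is_spam() ? 'spam' : 'ham';
    return sprintf '%s score=%.1f required=%.1f tests=%s',
        $verdict,
        $status->get_score(),
        $status->get_required_score(),
        $status->get_names_of_tests_hit();
}
```
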
       $num = $status->get_autolearn_status ()
           After a mail message has been checked, this method can be called.
           It will return one of the following strings depending on whether
           the mail was auto-learned or not: "ham", "no", "spam", "disabled",
           "failed", "unavailable".

           If the message is flagged with autolearn_force, the returned
           string also includes the status and the rules hit.  For example:
           "autolearn_force=yes (AUTOLEARNTEST_BODY)"

       $report = $status->get_report ()
           Deliver a "spam report" on the checked mail message.  This contains
           details of how many spam detection rules it triggered.

           The report is returned as a multi-line string, with the lines
           separated by "\n" characters.

       $preview = $status->get_content_preview ()
           Give a "preview" of the content.

           This is returned as a multi-line string, with the lines separated
           by "\n" characters, containing a fully-decoded, safe, plain-text
           sample of the first few lines of the message body.

       $msg = $status->get_message()
           Return the object representing the message being scanned.

       $status->rewrite_mail ()
           Rewrite the mail message.  This will at minimum add headers, and at
           maximum MIME-encapsulate the message text, to reflect its spam or
           not-spam status.  The function will return a scalar of the
           rewritten message.

           The actual modifications depend on the configuration (see
           "Mail::SpamAssassin::Conf" for more information).

           The possible modifications are as follows:

           To:, From: and Subject: modification on spam mails
               Depending on the configuration, the To: and From: lines can
               have a user-defined RFC 2822 comment appended for spam mail.
               The subject line may have a user-defined string prepended to it
               for spam mail.

           X-Spam-* headers for all mails
               Depending on the configuration and on whether the mail is spam
               or ham, zero or more headers with names beginning with
               "X-Spam-" will be added to the mail.

           spam message with report_safe
               If report_safe is set to true (1), then spam messages are
               encapsulated into their own message/rfc822 MIME attachment
               without any modifications being made.

               If report_safe is set to false (0), then the message will only
               have the above headers added/modified.

       $status->action_depends_on_tags($tags, $code, @args)
           Enqueue the supplied subroutine reference $code, to become runnable
           when all the specified tags become available.  The $tags may be a
           simple scalar (a tag name) or a listref of tag names.  The
           subroutine &$code, when called, will be passed a "permessagestatus"
           object as its first argument, followed by the supplied (optional)
           list @args.

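           A plugin might use this to defer work until several tags have been
           set.  The sketch below is illustrative only: the helper name and
           the tag names MYTAG_A and MYTAG_B are hypothetical, and the
           callback simply records the two tag values in an array reference
           passed through @args.

```perl
use strict;
use warnings;

# Hypothetical plugin helper: once both placeholder tags are available,
# append "valueA/valueB" to the array referenced by $sink.
sub collect_when_tags_ready {
    my ($status, $sink) = @_;
    $status->action_depends_on_tags(
        ['MYTAG_A', 'MYTAG_B'],            # listref of tag names
        sub {
            my ($pms, $out) = @_;          # status object first, then @args
            push @$out, join '/',
                map { $pms->get_tag($_) // '' } qw(MYTAG_A MYTAG_B);
        },
        $sink,                             # becomes $out in the callback
    );
    return;
}
```
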
       $status->set_tag($tagname, $value)
           Set a template tag, as used in "add_header", report templates, etc.
           This API is intended for use by plugins.  Tag names will be
           converted to an all-uppercase representation internally.

           $value can be a simple scalar (string or number), or a reference to
           an array, in which case the public method get_tag will join array
           elements using a space as a separator, returning a single string
           for backward compatibility.

           $value can also be a subroutine reference, which will be evaluated
           each time the template is expanded.  The first argument passed by
           get_tag to a called subroutine will be a PerMsgStatus object (this
           module's object), followed by optional arguments provided by a
           caller to get_tag.

           Note that Perl supports closures, which means that variables set in
           the caller's scope can be accessed inside this "sub".  For example:

               my $text = "hello world!";
               $status->set_tag("FOO", sub {
                         my $pms = shift;
                         return $text;
                       });

           See "Mail::SpamAssassin::Conf"'s "TEMPLATE TAGS" section for more
           details on how template tags are used.

           "undef" will be returned if a tag by that name has not been
           defined.

       $string = $status->get_tag($tagname)
           Get the current value of a template tag, as used in "add_header",
           report templates, etc.  This API is intended for use by plugins.
           Tag names will be converted to an all-uppercase representation
           internally.  See "Mail::SpamAssassin::Conf"'s "TEMPLATE TAGS"
           section for more details on tags.

           "undef" will be returned if a tag by that name has not been
           defined.

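           To illustrate the array-reference case, the sketch below stores a
           list under a tag and reads it back.  The helper name and the tag
           name MYPLUGINHITS are hypothetical; the space-joined result is
           what the set_tag description above leads one to expect.

```perl
use strict;
use warnings;

# Hypothetical plugin helper: publish a list-valued tag and read it back.
# MYPLUGINHITS is a made-up tag name used only for illustration.
sub publish_hits {
    my ($status, @hits) = @_;
    $status->set_tag('MYPLUGINHITS', [@hits]);   # arrayref value
    # Per the description above, get_tag joins array elements with a space.
    return $status->get_tag('MYPLUGINHITS');
}
```

           Code that needs the individual elements rather than the joined
           string may prefer get_tag_raw.
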
       $string = $status->get_tag_raw($tagname, @args)
           Similar to "get_tag", but keeps a tag name unchanged (does not
           uppercase it), and does not convert arrayref tag values into a
           single string.

       $status->set_spamd_result_item($subref)
           Set an entry for the spamd result log line.  $subref should be a
           code reference for a subroutine which will return a string in
           'name=VALUE' format, similar to the other entries in the spamd
           result line:

             Jul 17 14:10:47 radish spamd[16670]: spamd: result: Y 22 - ALL_NATURAL,
             DATE_IN_FUTURE_03_06,DIET_1,DRUGS_ERECTILE,DRUGS_PAIN,
             TEST_FORGED_YAHOO_RCVD,TEST_INVALID_DATE,TEST_NOREALNAME,
             TEST_NORMAL_HTTP_TO_IP,UNDISC_RECIPS scantime=0.4,size=3138,user=jm,
             uid=1000,required_score=5.0,rhost=localhost,raddr=127.0.0.1,
             rport=33153,mid=<9PS291LhupY>,autolearn=spam

           "name" and "VALUE" must not contain "=" or "," characters, as it is
           important that these log lines are easy to parse.

           The code reference will be called by spamd after the message has
           been scanned, and the "PerMsgStatus::check()" method has returned.

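           One way to honour the "=" and "," restriction is to sanitize both
           parts before building the code reference.  This is a sketch under
           stated assumptions: make_result_item is a hypothetical helper, and
           replacing the forbidden characters with "_" is a choice made here,
           not something SpamAssassin does for you.

```perl
use strict;
use warnings;

# Hypothetical helper: build a code reference that yields "name=VALUE".
# "=" and "," are replaced with "_" in both parts so the log line stays
# easy to parse; that substitution is this sketch's own convention.
sub make_result_item {
    my ($name, $value) = @_;
    s/[=,]/_/g for ($name, $value);
    return sub { return "$name=$value" };
}
```

           A plugin could then call
           $status->set_spamd_result_item(make_result_item('myplugin', $verdict)),
           where 'myplugin' and $verdict are placeholders.
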
       $status->finish ()
           Indicate that this $status object is finished with, and can be
           destroyed.

           If you are using SpamAssassin in a persistent environment, or
           checking many mail messages from one "Mail::SpamAssassin" factory,
           this method should be called to ensure Perl's garbage collection
           will clean up old status objects.

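           In a long-running process the pattern might look like the sketch
           below.  scan_all is a hypothetical helper, and passing the raw
           message text to parse() is an assumption of this example (the
           SYNOPSIS calls parse() without arguments).

```perl
use strict;
use warnings;

# Hypothetical helper: scan several raw messages with one factory object,
# releasing each status object as soon as its verdict has been recorded.
sub scan_all {
    my ($spamtest, @raw_messages) = @_;
    my @verdicts;
    for my $raw (@raw_messages) {
        my $mail   = $spamtest->parse($raw);    # assumed: raw text as argument
        my $status = $spamtest->check($mail);
        push @verdicts, $status->is_spam() ? 1 : 0;
        $status->finish();                      # done with per-message state
    }
    return @verdicts;
}
```
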
       $name = $status->get_current_eval_rule_name()
           Return the name of the currently-running eval rule.  "undef" is
           returned if no eval rule is currently being run.  Useful for
           plugins to determine the current rule name while inside an eval
           test function call.

       $status->get_decoded_body_text_array ()
           Returns the message body, with base64 or quoted-printable encodings
           decoded, and non-text parts or non-inline attachments stripped.

           It is returned as an array of strings, with each string
           representing one newline-separated line of the body.

       $status->get_decoded_stripped_body_text_array ()
           Returns the message body, decoded (as described in
           get_decoded_body_text_array()), with HTML rendered, and with
           whitespace normalized.

           It will always render text/html, and will use a heuristic to
           determine if other text/* parts should be considered text/html.

           It is returned as an array of strings, with each string
           representing one 'paragraph'.  Paragraphs, in plain-text mails, are
           double-newline-separated blocks of multi-line text.

       $status->get (header_name [, default_value])
           Returns a message header, pseudo-header, real name or address.
           "header_name" is the name of a mail header, such as 'Subject',
           'To', etc.  If "default_value" is given, it will be used if the
           requested "header_name" does not exist.

           Appending ":raw" to the header name will inhibit decoding of
           quoted-printable or base-64 encoded strings.

           Appending a modifier ":addr" to a header field name will cause
           everything except the first email address to be removed from the
           header field.  It is mainly applicable to header fields 'From',
           'Sender', 'To', 'Cc' along with their 'Resent-*' counterparts, and
           the 'Return-Path'.  For example, all of the following will result
           in "example@foo":

           example@foo
           example@foo (Foo Blah)
           example@foo, example@bar
           display: example@foo (Foo Blah), example@bar ;
           Foo Blah <example@foo>
           "Foo Blah" <example@foo>
           "'Foo Blah'" <example@foo>

           Appending a modifier ":name" to a header field name will cause
           everything except the first display name to be removed from the
           header field.  It is mainly applicable to header fields containing
           a single mail address: 'From', 'Sender', along with their
           'Resent-From' and 'Resent-Sender' counterparts.  For example, all
           of the following will result in "Foo Blah".  One level of single
           quotes is stripped too, as it is often seen.

           example@foo (Foo Blah)
           example@foo (Foo Blah), example@bar
           display: example@foo (Foo Blah), example@bar ;
           Foo Blah <example@foo>
           "Foo Blah" <example@foo>
           "'Foo Blah'" <example@foo>

           There are several special pseudo-headers that can be specified:

           "ALL" can be used to mean the text of all the message's headers.

           "ALL-TRUSTED" can be used to mean the text of all the message's
           headers that could only have been added by trusted relays.

           "ALL-INTERNAL" can be used to mean the text of all the message's
           headers that could only have been added by internal relays.

           "ALL-UNTRUSTED" can be used to mean the text of all the message's
           headers that may have been added by untrusted relays.  To make this
           pseudo-header more useful for header rules the 'Received' header
           that was added by the last trusted relay is included, even though
           it can be trusted.

           "ALL-EXTERNAL" can be used to mean the text of all the message's
           headers that may have been added by external relays.  Like
           "ALL-UNTRUSTED" the 'Received' header added by the last internal
           relay is included.

           "ToCc" can be used to mean the contents of both the 'To' and 'Cc'
           headers.

           "EnvelopeFrom" is the address used in the 'MAIL FROM:' phase of the
           SMTP transaction that delivered this message, if this data has been
           made available by the SMTP server.

           "MESSAGEID" is a symbol meaning all Message-Id's found in the
           message; some mailing list software moves the real 'Message-Id' to
           'Resent-Message-Id' or 'X-Message-Id', then uses its own one in the
           'Message-Id' header.  The value returned for this symbol is the
           text from all 3 headers, separated by newlines.

           "X-Spam-Relays-Untrusted" is the generated metadata of untrusted
           relays the message has passed through.

           "X-Spam-Relays-Trusted" is the generated metadata of trusted relays
           the message has passed through.

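
           The modifiers, the default value and a pseudo-header can be
           combined as in the following sketch.  sender_info is a
           hypothetical helper, and the '(no subject)' default is an
           arbitrary placeholder.

```perl
use strict;
use warnings;

# Hypothetical helper: collect a few sender-related values from $status.
# scalar() keeps each hash slot to exactly one value, whatever a given
# implementation of get() returns in list context.
sub sender_info {
    my ($status) = @_;
    return {
        address  => scalar $status->get('From:addr'),      # first address only
        name     => scalar $status->get('From:name'),      # first display name
        subject  => scalar $status->get('Subject', '(no subject)'),
        envelope => scalar $status->get('EnvelopeFrom'),   # pseudo-header
    };
}
```
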
       $status->get_uri_list ()
           Returns an array of all unique URIs found in the message.  It takes
           a combination of the URIs found in the rendered (decoded and HTML
           stripped) body and the URIs found when parsing the HTML in the
           message.  It also sets $status->{uri_list} (the array as returned
           by this function).

           The returned array will include the "raw" URI as well as "slightly
           cooked" versions.  For example, the single URI
           'http://%77&#00119;%77.example.com/' will get turned into: (
           'http://%77&#00119;%77.example.com/', 'http://www.example.com/' )

       $status->get_uri_detail_list ()
           Returns a hash reference of all unique URIs found in the message
           and various data about where the URIs were found in the message.
           It takes a combination of the URIs found in the rendered (decoded
           and HTML stripped) body and the URIs found when parsing the HTML in
           the message.  It also sets $status->{uri_detail_list} (the hash
           reference as returned by this function) and
           $status->{uri_domain_count} (count of unique domains).

           The hash format looks something like this:

             raw_uri => {
               types => { a => 1, img => 1, parsed => 1 },
               cleaned => [ canonicalized_uri ],
               anchor_text => [ "click here", "no click here" ],
               domains => { domain1 => 1, domain2 => 1 },
               hosts => { host1 => domain1, host2 => domain2 },
             }

           "raw_uri" is whatever the URI was in the message itself
           (http://spamassassin.apache%2Eorg/).

           "types" is a hash of the HTML tags (lowercase) which referenced the
           raw_uri.  "parsed" is a faked type which specifies that the raw_uri
           was seen in the rendered text.

           "cleaned" is an array of the raw and canonicalized version of the
           raw_uri (http://spamassassin.apache%2Eorg/,
           http://spamassassin.apache.org/).

           "anchor_text" is an array of the anchor text (text between <a> and
           </a>), if any, which linked to the URI.

           "domains" is a hash of the domains found in the canonicalized URIs.

           "hosts" is a hash of unstripped hostnames found in the
           canonicalized URIs as hash keys, with their domain part stored as a
           value of each hash entry.

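           As an illustration of walking this structure, the sketch below
           lists the unique domains of URIs that were referenced by <a> tags.
           linked_domains is a hypothetical helper; it relies only on the
           "types" and "domains" keys described above.

```perl
use strict;
use warnings;

# Hypothetical helper: sorted list of unique domains from anchor-linked URIs.
sub linked_domains {
    my ($status) = @_;
    my $detail = $status->get_uri_detail_list();
    my %seen;
    for my $raw_uri (keys %$detail) {
        my $info = $detail->{$raw_uri};
        next unless ($info->{types} || {})->{a};   # only URIs used in <a> tags
        $seen{$_} = 1 for keys %{ $info->{domains} || {} };
    }
    return sort keys %seen;
}
```
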
       $status->clear_test_state()
           Clear test state, including test log messages from
           "$status->test_log()".

       $status->got_hit ($rulename, $desc_prepend [, name => value, ...])
           Register a hit against a rule in the ruleset.

           There are two mandatory arguments.  These are $rulename, the name
           of the rule that fired, and $desc_prepend, which is a short string
           that will be prepended to the rule's "describe" string in output
           reports.

           In addition, callers can supplement that with the following
           optional data:

           score => $num
               Optional: the score to use for the rule hit.  If unspecified,
               the value from the "Mail::SpamAssassin::Conf" object's
               "{scores}" hash will be used (a configured score), and in its
               absence the "defscore" option value.

           defscore => $num
               Optional: the score to use for the rule hit if neither the
               option "score" is provided, nor a configured score value is
               provided.

           value => $num
               Optional: the value to assign to the rule; the default value is
               1.  Rules with the 'multiple' tflag use values greater than 1
               to indicate multiple hits.  This value is accessible to meta
               rules.

           ruletype => $type
               Optional, but recommended: the rule type string.  This is used
               in the "hit_rule" plugin call, called by this method.  If
               unset, 'unknown' is used.

           tflags => $string
               Optional: a string, i.e. a space-separated list of additional
               tflags to be appended to an existing list of flags in
               $self->{conf}->{tflags}, such as: "nice noautolearn multiple".
               No syntax checks are performed.

           description => $string
               Optional: a custom rule description string.  This is used in
               the "hit_rule" plugin call, called by this method.  If unset,
               the static description is used.

           Backward compatibility: the two mandatory arguments have been part
           of this API since SpamAssassin 2.x.  The optional name => value
           pairs, however, are a new addition in SpamAssassin 3.2.0.

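           A call with optional name => value pairs might look like the
           sketch below.  The helper name, the rule name MY_CUSTOM_RULE, the
           'CUSTOM: ' prefix and the description text are all hypothetical,
           and 'eval' is used as the rule type string on the assumption that
           the hit comes from an eval test.

```perl
use strict;
use warnings;

# Hypothetical plugin helper: register a hit with an explicit score.
# Rule name, prefix and description are placeholders for illustration.
sub report_custom_hit {
    my ($status, $score) = @_;
    $status->got_hit(
        'MY_CUSTOM_RULE',                # $rulename
        'CUSTOM: ',                      # $desc_prepend
        score       => $score,           # overrides any configured score
        ruletype    => 'eval',           # assumed rule type string
        description => 'Example hit registered by a plugin',
    );
    return;
}
```
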
       $status->create_fulltext_tmpfile (fulltext_ref)
           This function creates a temporary file containing the passed scalar
           reference data (typically the full/pristine text of the message).
           This is typically used by external programs like pyzor and dccproc,
           to avoid hangs due to buffering issues.  Methods that need this
           should call $self->create_fulltext_tmpfile($fulltext) to retrieve
           the temporary filename; the file will be created if it does not
           already exist.

           Note: This can only be called once until
           $status->delete_fulltext_tmpfile() is called.

       $status->delete_fulltext_tmpfile ()
           Cleans up after a $status->create_fulltext_tmpfile() call.
           Deletes the temporary file and uncaches the filename.

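           Because the temporary file must be deleted before another one can
           be created, it can help to pair the two calls in one place.  The
           sketch below does that; with_fulltext_tmpfile is a hypothetical
           helper, and $code stands for whatever needs the file name (for
           example, running an external program).

```perl
use strict;
use warnings;

# Hypothetical helper: run $code with the temporary file name, then clean
# up even if $code dies, and finally re-throw any error.
sub with_fulltext_tmpfile {
    my ($status, $fulltext_ref, $code) = @_;
    my $tmpfile = $status->create_fulltext_tmpfile($fulltext_ref);
    my $result  = eval { $code->($tmpfile) };
    my $error   = $@;
    $status->delete_fulltext_tmpfile();
    die $error if $error;
    return $result;
}
```
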
       all_from_addrs_domains
           This function returns all the various from addresses in a message
           using all_from_addrs() and then returns only the domain names.

SEE ALSO