Mail::SpamAssassin::PerMsgStatus(3pm)


NAME

       Mail::SpamAssassin::PerMsgStatus - per-message status (spam or
       not-spam)

SYNOPSIS

         use Mail::SpamAssassin;

         my $spamtest = Mail::SpamAssassin->new({
           'rules_filename'      => '/etc/spamassassin.rules',
           'userprefs_filename'  => $ENV{HOME}.'/.spamassassin/user_prefs'
         });
         # with no argument, parse() reads the message from STDIN
         my $mail = $spamtest->parse();

         my $status = $spamtest->check ($mail);

         my $rewritten_mail;
         if ($status->is_spam()) {
           $rewritten_mail = $status->rewrite_mail ();
         }
         ...

DESCRIPTION

       The Mail::SpamAssassin "check()" method returns an object of this
       class.  This object encapsulates all the per-message state.

METHODS

       $status->check ()
           Runs the SpamAssassin rules against the message pointed to by the
           object.

       $status->learn()
           After a mail message has been checked, this method can be called.
           If the score is outside a certain range around the threshold, i.e.
           if the message is judged more-or-less definitely spam or definitely
           non-spam, it will be fed into SpamAssassin's learning systems
           (currently the naive Bayesian classifier), so that future similar
           mails will be caught.

       $score = $status->get_autolearn_points()
           Return the message's score as computed for auto-learning.  Certain
           tests are ignored:

             - rules with tflags set to 'learn' (the Bayesian rules)

             - rules with tflags set to 'userconf' (user white/black-listing rules, etc.)

             - rules with tflags set to 'noautolearn'

           Also note that auto-learning occurs using scores from either
           scoreset 0 or 1, depending on what scoreset is used during message
           check.  It is likely that the message check and auto-learn scores
           will be different.

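           The exclusion list for get_autolearn_points() can be pictured with
           a small stand-alone sketch.  It does not use SpamAssassin itself:
           the rule names, scores and the autolearn_points() helper are all
           hypothetical, and the scoreset difference mentioned above is
           ignored.

```perl
use strict;
use warnings;

# Hypothetical rule hits; each carries a score and a tflags string.
my @hits = (
    { name => 'EXAMPLE_BAYES',    score => 3.5,  tflags => 'learn' },
    { name => 'EXAMPLE_USERCONF', score => -100, tflags => 'userconf nice' },
    { name => 'EXAMPLE_NOLEARN',  score => 2.0,  tflags => 'noautolearn' },
    { name => 'EXAMPLE_HEADER',   score => 1.5,  tflags => '' },
    { name => 'EXAMPLE_BODY',     score => 0.5,  tflags => '' },
);

# tflags that exclude a rule from the auto-learn score (from the list above).
my %skip = map { $_ => 1 } qw(learn userconf noautolearn);

# Sum the scores of all hits that carry none of the excluded tflags.
sub autolearn_points {
    my @rules = @_;
    my $total = 0;
    for my $rule (@rules) {
        my @flags = split ' ', $rule->{tflags};
        next if grep { $skip{$_} } @flags;
        $total += $rule->{score};
    }
    return $total;
}

my $points = autolearn_points(@hits);
printf "auto-learn points: %.1f\n", $points;
```

           Only EXAMPLE_HEADER and EXAMPLE_BODY contribute here, so the
           result differs from the sum over all five hits.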
       $score = $status->get_head_only_points()
           Return the message's score as computed for auto-learning, ignoring
           all rules except for header-based ones.

       $score = $status->get_learned_points()
           Return the message's score as computed for auto-learning, ignoring
           all rules except for learning-based ones.

       $score = $status->get_body_only_points()
           Return the message's score as computed for auto-learning, ignoring
           all rules except for body-based ones.

       $score = $status->get_autolearn_force_status()
           Return whether a message's score included any rules that are
           flagged as autolearn_force.

       $rule_names = $status->get_autolearn_force_names()
           Return a comma-separated list of rule names if a message's score
           included any rules that are flagged as autolearn_force.

       $isspam = $status->is_spam ()
           After a mail message has been checked, this method can be called.
           It will return 1 for mail determined likely to be spam, 0 if it
           does not seem spam-like.

       $list = $status->get_names_of_tests_hit ()
           After a mail message has been checked, this method can be called.
           It will return a comma-separated string, listing all the symbolic
           test names of the tests which were triggered by the mail.

       $list = $status->get_names_of_tests_hit_with_scores_hash ()
           After a mail message has been checked, this method can be called.
           It will return a reference to a hash of rule/score pairs, giving
           the symbolic test name and individual score of each test which
           was triggered by the mail.

       $list = $status->get_names_of_tests_hit_with_scores ()
           After a mail message has been checked, this method can be called.
           It will return a comma-separated string of rule=score pairs for all
           the symbolic test names and individual scores of the tests which
           were triggered by the mail.

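           A caller that only has the string form can turn it back into a
           hash.  The sketch below is stand-alone and assumes nothing beyond
           the "rule=score" comma-separated format described above; the
           sample string and the parse_rule_scores() helper are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical sample of the documented rule=score string format.
my $list = 'EXAMPLE_RULE_A=1.5,EXAMPLE_RULE_B=-0.5,EXAMPLE_RULE_C=2';

# Split the string into a hash reference mapping rule name to numeric score.
sub parse_rule_scores {
    my ($pairs) = @_;
    my %score_for;
    for my $pair (split /,/, $pairs) {
        my ($rule, $score) = split /=/, $pair, 2;
        next unless defined $score;    # skip anything that is not rule=score
        $score_for{$rule} = $score + 0;
    }
    return \%score_for;
}

my $scores = parse_rule_scores($list);
for my $rule (sort keys %$scores) {
    print "$rule scored $scores->{$rule}\n";
}
```

           When the hash is what is wanted in the first place,
           get_names_of_tests_hit_with_scores_hash() avoids the round trip
           through a string.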
       $list = $status->get_names_of_subtests_hit ()
           After a mail message has been checked, this method can be called.
           It will return a comma-separated string, listing all the symbolic
           test names of the meta-rule sub-tests which were triggered by the
           mail.  Sub-tests are the normally-hidden rules, which score 0 and
           have names beginning with two underscores, used in meta rules.

       $num = $status->get_score ()
           After a mail message has been checked, this method can be called.
           It will return the message's score.

       $num = $status->get_required_score ()
           After a mail message has been checked, this method can be called.
           It will return the score required for a mail to be considered spam.

       $num = $status->get_autolearn_status ()
           After a mail message has been checked, this method can be called.
           It will return one of the following strings depending on whether
           the mail was auto-learned or not: "ham", "no", "spam", "disabled",
           "failed", "unavailable".

           If the message is flagged with autolearn_force, the returned
           string will also include that status and the rules hit.  For
           example: "autolearn_force=yes (AUTOLEARNTEST_BODY)"

       $report = $status->get_report ()
           Deliver a "spam report" on the checked mail message.  This contains
           details of how many spam detection rules it triggered.

           The report is returned as a multi-line string, with the lines
           separated by "\n" characters.

       $preview = $status->get_content_preview ()
           Give a "preview" of the content.

           This is returned as a multi-line string, with the lines separated
           by "\n" characters, containing a fully-decoded, safe, plain-text
           sample of the first few lines of the message body.

       $msg = $status->get_message()
           Return the object representing the message being scanned.

       $status->rewrite_mail ()
           Rewrite the mail message.  This will at minimum add headers, and at
           maximum MIME-encapsulate the message text, to reflect its spam or
           not-spam status.  The function will return a scalar of the
           rewritten message.

           The actual modifications depend on the configuration (see
           "Mail::SpamAssassin::Conf" for more information).

           The possible modifications are as follows:

           To:, From: and Subject: modification on spam mails
               Depending on the configuration, the To: and From: lines can
               have a user-defined RFC 2822 comment appended for spam mail.
               The subject line may have a user-defined string prepended to it
               for spam mail.

           X-Spam-* headers for all mails
               Zero or more headers with names beginning with "X-Spam-" will
               be added to the mail.  Which headers are added depends on the
               configuration and on whether the mail is spam or ham.

           spam message with report_safe
               If report_safe is set to true (1), then spam messages are
               encapsulated into their own message/rfc822 MIME attachment
               without any modifications being made.

               If report_safe is set to false (0), then the message will only
               have the above headers added/modified.

       $status->action_depends_on_tags($tags, $code, @args)
           Enqueue the supplied subroutine reference $code, to become runnable
           when all the specified tags become available.  $tags may be either
           a simple scalar (a single tag name) or a listref of tag names.
           When called, the subroutine &$code will be passed a
           "permessagestatus" object as its first argument, followed by the
           supplied (optional) list @args.

       $status->set_tag($tagname, $value)
           Set a template tag, as used in "add_header", report templates, etc.
           This API is intended for use by plugins.  Tag names will be
           converted to an all-uppercase representation internally.

           $value can be a simple scalar (string or number), or a reference to
           an array, in which case the public method get_tag will join array
           elements using a space as a separator, returning a single string
           for backward compatibility.

           $value can also be a subroutine reference, which will be evaluated
           each time the template is expanded.  The first argument passed by
           get_tag to a called subroutine will be a PerMsgStatus object (this
           module's object), followed by optional arguments provided by a
           caller to get_tag.

           Note that Perl supports closures, which means that variables set in
           the caller's scope can be accessed inside this "sub".  For example:

               my $text = "hello world!";
               $status->set_tag("FOO", sub {
                         my $pms = shift;
                         return $text;
                       });

           See "Mail::SpamAssassin::Conf"'s "TEMPLATE TAGS" section for more
           details on how template tags are used.

           "undef" will be returned if a tag by that name has not been
           defined.

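           Because a subroutine-valued tag is evaluated on each expansion, a
           closure sees the current value of the variables it captured, not
           the value they had when set_tag was called.  The stand-alone
           sketch below imitates that behaviour with a plain hash.  The
           set_tag_sketch() and get_tag_sketch() helpers are hypothetical
           and are not SpamAssassin's implementation; unlike the real
           get_tag, they do not pass a PerMsgStatus object to the subroutine.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a tag table; NOT SpamAssassin's internal storage.
my %tags;

# Store a value under the upper-cased tag name.
sub set_tag_sketch {
    my ($name, $value) = @_;
    $tags{ uc $name } = $value;
    return;
}

# Look a tag up: call code refs, join array refs with a space,
# return plain scalars unchanged, and undef for unknown tags.
sub get_tag_sketch {
    my ($name, @args) = @_;
    my $value = $tags{ uc $name };
    return undef unless defined $value;
    return $value->(@args)     if ref($value) eq 'CODE';
    return join(' ', @$value)  if ref($value) eq 'ARRAY';
    return $value;
}

my $count = 1;
set_tag_sketch('foo', sub { return "count=$count" });

my $first = get_tag_sketch('FOO');    # evaluated now, while $count is 1
$count = 2;
my $second = get_tag_sketch('foo');   # evaluated again, $count is now 2

set_tag_sketch('LIST', [qw(a b c)]);
my $joined = get_tag_sketch('list');

print "$first / $second / $joined\n";
```

           The two lookups of the same tag return different strings because
           the closure reads $count at call time.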
       $string = $status->get_tag($tagname)
           Get the current value of a template tag, as used in "add_header",
           report templates, etc. This API is intended for use by plugins.
           Tag names will be converted to an all-uppercase representation
           internally.  See "Mail::SpamAssassin::Conf"'s "TEMPLATE TAGS"
           section for more details on tags.

           "undef" will be returned if a tag by that name has not been
           defined.

       $string = $status->get_tag_raw($tagname, @args)
           Similar to "get_tag", but keeps a tag name unchanged (does not
           uppercase it), and does not convert arrayref tag values into a
           single string.

       $status->set_spamd_result_item($subref)
           Set an entry for the spamd result log line.  $subref should be a
           code reference for a subroutine which will return a string in
           'name=VALUE' format, similar to the other entries in the spamd
           result line:

             Jul 17 14:10:47 radish spamd[16670]: spamd: result: Y 22 - ALL_NATURAL,
             DATE_IN_FUTURE_03_06,DIET_1,DRUGS_ERECTILE,DRUGS_PAIN,
             TEST_FORGED_YAHOO_RCVD,TEST_INVALID_DATE,TEST_NOREALNAME,
             TEST_NORMAL_HTTP_TO_IP,UNDISC_RECIPS scantime=0.4,size=3138,user=jm,
             uid=1000,required_score=5.0,rhost=localhost,raddr=127.0.0.1,
             rport=33153,mid=<9PS291LhupY>,autolearn=spam

           "name" and "VALUE" must not contain "=" or "," characters, as it is
           important that these log lines are easy to parse.

           The code reference will be called by spamd after the message has
           been scanned, and the "PerMsgStatus::check()" method has returned.

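           One way to respect the "no = or ," restriction is to sanitize both
           parts before joining them.  The stand-alone sketch below is only
           an illustration: the make_result_item() helper, the entry name
           and the choice of "_" as the replacement character are all
           assumptions, and the code reference ignores any arguments it
           might be given.

```perl
use strict;
use warnings;

# Hypothetical helper: build a 'name=VALUE' entry, replacing the
# forbidden "=" and "," characters with underscores (an arbitrary choice).
sub make_result_item {
    my ($name, $value) = @_;
    for ($name, $value) {
        s/[=,]/_/g;    # modifies the local copies through the loop alias
    }
    return "$name=$value";
}

# A code reference of the kind described above: it returns one entry string.
my $subref = sub { return make_result_item('example_plugin', 'a,b=c') };

my $entry = $subref->();
print "$entry\n";
```

           A plugin would hand such a code reference to
           $status->set_spamd_result_item($subref) rather than calling it
           directly.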
       $status->finish ()
           Indicate that this $status object is finished with, and can be
           destroyed.

           If you are using SpamAssassin in a persistent environment, or
           checking many mail messages from one "Mail::SpamAssassin" factory,
           this method should be called to ensure Perl's garbage collection
           will clean up old status objects.

       $name = $status->get_current_eval_rule_name()
           Return the name of the currently-running eval rule.  "undef" is
           returned if no eval rule is currently being run.  Useful for
           plugins to determine the current rule name while inside an eval
           test function call.

       $status->get_decoded_body_text_array ()
           Returns the message body, with base64 or quoted-printable encodings
           decoded, and non-text parts or non-inline attachments stripped.

           This is the same result text as used in 'rawbody' rules.

           It is returned as an array of strings, with each string being a
           2-4kB chunk of the body, split from boundaries if possible.

       $status->get_decoded_stripped_body_text_array ()
           Returns the message body, decoded (as described in
           get_decoded_body_text_array()), with HTML rendered, and with
           whitespace normalized.

           This is the same result text as used in 'body' rules.

           It will always render text/html, and will use a heuristic to
           determine if other text/* parts should be considered text/html.

           It is returned as an array of strings, with each string
           representing one 'paragraph'.  Paragraphs, in plain-text mails, are
           double-newline-separated blocks of multi-line text.

       $status->get (header_name [, default_value])
           Returns a message header, pseudo-header, real name or address.
           "header_name" is the name of a mail header, such as 'Subject',
           'To', etc.  If "default_value" is given, it will be used if the
           requested "header_name" does not exist.

           Appending ":raw" to the header name will inhibit decoding of
           quoted-printable or base64 encoded strings.

           Appending a modifier ":addr" to a header field name will cause
           everything except the first email address to be removed from the
           header field.  It is mainly applicable to header fields 'From',
           'Sender', 'To', 'Cc' along with their 'Resent-*' counterparts, and
           the 'Return-Path'. For example, all of the following will result in
           "example@foo":

           example@foo
           example@foo (Foo Blah)
           example@foo, example@bar
           display: example@foo (Foo Blah), example@bar ;
           Foo Blah <example@foo>
           "Foo Blah" <example@foo>
           "'Foo Blah'" <example@foo>

           Appending a modifier ":name" to a header field name will cause
           everything except the first display name to be removed from the
           header field. It is mainly applicable to header fields containing a
           single mail address: 'From', 'Sender', along with their
           'Resent-From' and 'Resent-Sender' counterparts.  For example, all
           of the following will result in "Foo Blah". One level of single
           quotes is stripped too, as it is often seen.

           example@foo (Foo Blah)
           example@foo (Foo Blah), example@bar
           display: example@foo (Foo Blah), example@bar ;
           Foo Blah <example@foo>
           "Foo Blah" <example@foo>
           "'Foo Blah'" <example@foo>

           There are several special pseudo-headers that can be specified:

           "ALL" can be used to mean the text of all the message's headers.

           "ALL-TRUSTED" can be used to mean the text of all the message's
           headers that could only have been added by trusted relays.

           "ALL-INTERNAL" can be used to mean the text of all the message's
           headers that could only have been added by internal relays.

           "ALL-UNTRUSTED" can be used to mean the text of all the message's
           headers that may have been added by untrusted relays.  To make this
           pseudo-header more useful for header rules the 'Received' header
           that was added by the last trusted relay is included, even though
           it can be trusted.

           "ALL-EXTERNAL" can be used to mean the text of all the message's
           headers that may have been added by external relays.  Like
           "ALL-UNTRUSTED" the 'Received' header added by the last internal
           relay is included.

           "ToCc" can be used to mean the contents of both the 'To' and 'Cc'
           headers.

           "EnvelopeFrom" is the address used in the 'MAIL FROM:' phase of the
           SMTP transaction that delivered this message, if this data has been
           made available by the SMTP server.

           "MESSAGEID" is a symbol meaning all Message-Ids found in the
           message; some mailing list software moves the real 'Message-Id' to
           'Resent-Message-Id' or 'X-Message-Id', then uses its own one in the
           'Message-Id' header.  The value returned for this symbol is the
           text from all 3 headers, separated by newlines.

           "X-Spam-Relays-Untrusted" is the generated metadata of untrusted
           relays the message has passed through.

           "X-Spam-Relays-Trusted" is the generated metadata of trusted relays
           the message has passed through.

       $status->get_uri_list ()
           Returns an array of all unique URIs found in the message.  It takes
           a combination of the URIs found in the rendered (decoded and HTML
           stripped) body and the URIs found when parsing the HTML in the
           message.  Will also set $status->{uri_list} (the array as returned
           by this function).

           The returned array will include the "raw" URI as well as "slightly
           cooked" versions.  For example, the single URI
           'http://%77&#00119;%77.example.com/' will get turned into: (
           'http://%77&#00119;%77.example.com/', 'http://www.example.com/' )

       $status->get_uri_detail_list ()
           Returns a hash reference of all unique URIs found in the message
           and various data about where the URIs were found in the message.
           It takes a combination of the URIs found in the rendered (decoded
           and HTML stripped) body and the URIs found when parsing the HTML in
           the message.  Will also set $status->{uri_detail_list} (the hash
           reference as returned by this function).  This function will also
           set $status->{uri_domain_count} (count of unique domains).

           The hash format looks something like this:

             raw_uri => {
               types => { a => 1, img => 1, parsed => 1 },
               cleaned => [ canonicalized_uri ],
               anchor_text => [ "click here", "no click here" ],
               domains => { domain1 => 1, domain2 => 1 },
             }

           "raw_uri" is whatever the URI was in the message itself
           (http://spamassassin.apache%2Eorg/).

           "types" is a hash of the HTML tags (lowercase) which referenced the
           raw_uri.  "parsed" is a pseudo-type which specifies that the
           raw_uri was seen in the rendered text.

           "cleaned" is an array of the raw and canonicalized version of the
           raw_uri (http://spamassassin.apache%2Eorg/,
           http://spamassassin.apache.org/).

           "anchor_text" is an array of the anchor text (text between <a> and
           </a>), if any, which linked to the URI.

           "domains" is a hash of the domains found in the canonicalized URIs.

           "hosts" is a hash of unstripped hostnames found in the
           canonicalized URIs as hash keys, with their domain part stored as a
           value of each hash entry.

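           Walking a structure of this shape is ordinary hash traversal.  The
           stand-alone sketch below uses hand-written sample data in the
           documented layout instead of calling get_uri_detail_list(); the
           URIs and the two helper functions are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical data in the hash format shown above.
my $uri_detail = {
    'http://www.example.com/a' => {
        types       => { a => 1, parsed => 1 },
        cleaned     => ['http://www.example.com/a'],
        anchor_text => ['click here'],
        domains     => { 'example.com' => 1 },
    },
    'http://example.com/b' => {
        types   => { parsed => 1 },
        cleaned => ['http://example.com/b'],
        domains => { 'example.com' => 1 },
    },
    'http://img.example.net/x.png' => {
        types   => { img => 1 },
        cleaned => ['http://img.example.net/x.png'],
        domains => { 'example.net' => 1 },
    },
};

# Count how many raw URIs mention each domain.
sub count_uris_per_domain {
    my ($detail) = @_;
    my %count;
    for my $info (values %$detail) {
        $count{$_}++ for keys %{ $info->{domains} || {} };
    }
    return \%count;
}

# Return the raw URIs that were seen with the given type, sorted.
sub uris_of_type {
    my ($detail, $type) = @_;
    my @found = grep { exists $detail->{$_}{types}{$type} } keys %$detail;
    @found = sort @found;
    return @found;
}

my $per_domain = count_uris_per_domain($uri_detail);
my @linked     = uris_of_type($uri_detail, 'a');

print "$_: $per_domain->{$_}\n" for sort keys %$per_domain;
print "linked from <a>: @linked\n";
```

           In a plugin, the same traversal would start from the hash
           reference returned by $status->get_uri_detail_list().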
       $status->clear_test_state()
           Clear test state, including test log messages from
           "$status->test_log()".

       $status->got_hit ($rulename, $desc_prepend [, name => value, ...])
           Register a hit against a rule in the ruleset.

           There are two mandatory arguments. These are $rulename, the name of
           the rule that fired, and $desc_prepend, which is a short string
           that will be prepended to the rule's "describe" string in output
           reports.

           In addition, callers can supplement that with the following
           optional data:

           score => $num
               Optional: the score to use for the rule hit.  If unspecified,
               the value from the "Mail::SpamAssassin::Conf" object's
               "{scores}" hash will be used (a configured score), and in its
               absence the "defscore" option value.

           defscore => $num
               Optional: the score to use for the rule hit if neither the
               option "score" is provided, nor a configured score value is
               provided.

           value => $num
               Optional: the value to assign to the rule; the default value is
               1.  Rules with the "multiple" tflag use values greater than 1
               to indicate multiple hits.  This value is accessible to meta
               rules.

           ruletype => $type
               Optional, but recommended: the rule type string.  This is used
               in the "hit_rule" plugin call, called by this method.  If
               unset, 'unknown' is used.

           tflags => $string
               Optional: a string, i.e. a space-separated list of additional
               tflags to be appended to an existing list of flags in
               $self->{conf}->{tflags}, such as: "nice noautolearn multiple".
               No syntax checks are performed.

           description => $string
               Optional: a custom rule description string.  This is used in
               the "hit_rule" plugin call, called by this method. If unset,
               the static description is used.

           Backward compatibility: the two mandatory arguments have been part
           of this API since SpamAssassin 2.x.  The optional name => value
           pairs, however, are a new addition in SpamAssassin 3.2.0.

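           The precedence among "score", a configured score and "defscore"
           described above can be written out as a small stand-alone
           function.  This is a sketch of the documented order only: the
           resolve_score() helper, the %configured table and the rule names
           are hypothetical, and returning undef when nothing applies is an
           assumption rather than documented behaviour.

```perl
use strict;
use warnings;

# Hypothetical stand-in for configured scores (the Conf object's {scores}).
my %configured = ( EXAMPLE_CONFIGURED => 2.5 );

# Pick the score for a hit: explicit 'score' option first, then the
# configured score, then the 'defscore' option.
sub resolve_score {
    my ($rulename, %opts) = @_;
    return $opts{score}           if defined $opts{score};
    return $configured{$rulename} if defined $configured{$rulename};
    return $opts{defscore};    # undef if no defscore was given (assumption)
}

my $explicit = resolve_score('EXAMPLE_CONFIGURED', score => 1.0, defscore => 0.1);
my $from_cfg = resolve_score('EXAMPLE_CONFIGURED', defscore => 0.1);
my $fallback = resolve_score('EXAMPLE_UNCONFIGURED', defscore => 0.1);

print "explicit=$explicit configured=$from_cfg fallback=$fallback\n";
```

           A plugin that registers hits for rules without a configured score
           would typically supply "defscore" so that the hit still carries a
           meaningful value.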
       $status->create_fulltext_tmpfile (fulltext_ref)
           This function creates a temporary file containing the passed scalar
           reference data (typically the full/pristine text of the message).
           This is typically used by external programs like pyzor and dccproc,
           to avoid hangs due to buffering issues.  Methods that need this
           should call $self->create_fulltext_tmpfile($fulltext) to retrieve
           the temporary filename; the file will be created if it does not
           already exist.

           Note: This can only be called once until
           $status->delete_fulltext_tmpfile() is called.

       $status->delete_fulltext_tmpfile ()
           Will clean up after a $status->create_fulltext_tmpfile() call.
           Deletes the temporary file and uncaches the filename.

       all_from_addrs_domains
           This function collects all the various From addresses in a message
           using all_from_addrs() and returns only their domain names.
