Mail::SpamAssassin::PerMsgStatus(3pm)

Mail::SpamAssassin::PerMsgStatus(3)   User Contributed Perl Documentation   Mail::SpamAssassin::PerMsgStatus(3)

NAME

       Mail::SpamAssassin::PerMsgStatus - per-message status (spam or
       not-spam)

SYNOPSIS

         my $spamtest = Mail::SpamAssassin->new({
           'rules_filename'      => '/etc/spamassassin.rules',
           'userprefs_filename'  => $ENV{HOME}.'/.spamassassin/user_prefs'
         });
         my $mail = $spamtest->parse();

         my $status = $spamtest->check ($mail);

         my $rewritten_mail;
         if ($status->is_spam()) {
           $rewritten_mail = $status->rewrite_mail ();
         }
         ...

DESCRIPTION

       The Mail::SpamAssassin "check()" method returns an object of this
       class.  This object encapsulates all the per-message state.

METHODS

       $status->check ()
           Runs the SpamAssassin rules against the message pointed to by the
           object.

       $status->learn()
           After a mail message has been checked, this method can be called.
           If the score is outside a certain range around the threshold, i.e.
           if the message is judged more-or-less definitely spam or definitely
           non-spam, it will be fed into SpamAssassin's learning systems
           (currently the naive Bayesian classifier), so that future similar
           mails will be caught.

       $score = $status->get_autolearn_points()
           Return the message's score as computed for auto-learning.  Certain
           tests are ignored:

             - rules with tflags set to 'learn' (the Bayesian rules)

             - rules with tflags set to 'userconf' (user white/black-listing
               rules, etc.)

             - rules with tflags set to 'noautolearn'

           Also note that auto-learning occurs using scores from either
           scoreset 0 or 1, depending on what scoreset is used during message
           check.  It is likely that the message check and auto-learn scores
           will be different.

       $score = $status->get_head_only_points()
           Return the message's score as computed for auto-learning, ignoring
           all rules except for header-based ones.

       $score = $status->get_learned_points()
           Return the message's score as computed for auto-learning, ignoring
           all rules except for learning-based ones.

       $score = $status->get_body_only_points()
           Return the message's score as computed for auto-learning, ignoring
           all rules except for body-based ones.

       $isspam = $status->is_spam ()
           After a mail message has been checked, this method can be called.
           It will return 1 for mail determined likely to be spam, 0 if it
           does not seem spam-like.

       $list = $status->get_names_of_tests_hit ()
           After a mail message has been checked, this method can be called.
           It will return a comma-separated string, listing all the symbolic
           test names of the tests which were triggered by the mail.

       $list = $status->get_names_of_subtests_hit ()
           After a mail message has been checked, this method can be called.
           It will return a comma-separated string, listing all the symbolic
           test names of the meta-rule sub-tests which were triggered by the
           mail.  Sub-tests are the normally-hidden rules, which score 0 and
           have names beginning with two underscores, used in meta rules.

       $num = $status->get_score ()
           After a mail message has been checked, this method can be called.
           It will return the message's score.

       $num = $status->get_required_score ()
           After a mail message has been checked, this method can be called.
           It will return the score required for a mail to be considered spam.

       $num = $status->get_autolearn_status ()
           After a mail message has been checked, this method can be called.
           It will return one of the following strings depending on whether
           the mail was auto-learned or not: "ham", "no", "spam", "disabled",
           "failed", "unavailable".

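           The accessors above are usually read together once check() has
           returned.  The following is a minimal sketch only:
           summarize_status() is a hypothetical helper name, not part of
           SpamAssassin, and $status is assumed to be an already-checked
           PerMsgStatus object.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of SpamAssassin): builds a one-line
# summary from a checked PerMsgStatus object.  It only relies on the
# accessor methods documented above.
sub summarize_status {
    my ($status) = @_;
    return sprintf(
        "%s score=%.1f required=%.1f tests=%s autolearn=%s",
        ($status->is_spam() ? "SPAM" : "ham"),
        $status->get_score(),
        $status->get_required_score(),
        $status->get_names_of_tests_hit(),   # comma-separated string
        $status->get_autolearn_status(),     # "ham", "no", "spam", ...
    );
}

# Typical call site, following the SYNOPSIS:
#   my $status = $spamtest->check($mail);
#   print summarize_status($status), "\n";
```
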
       $report = $status->get_report ()
           Deliver a "spam report" on the checked mail message.  This contains
           details of how many spam detection rules it triggered.

           The report is returned as a multi-line string, with the lines
           separated by "\n" characters.

       $preview = $status->get_content_preview ()
           Give a "preview" of the content.

           This is returned as a multi-line string, with the lines separated
           by "\n" characters, containing a fully-decoded, safe, plain-text
           sample of the first few lines of the message body.

       $msg = $status->get_message()
           Return the object representing the message being scanned.

       $status->rewrite_mail ()
           Rewrite the mail message.  This will at minimum add headers, and at
           maximum MIME-encapsulate the message text, to reflect its spam or
           not-spam status.  The function will return a scalar of the
           rewritten message.

           The actual modifications depend on the configuration (see
           "Mail::SpamAssassin::Conf" for more information).

           The possible modifications are as follows:

           To:, From: and Subject: modification on spam mails
               Depending on the configuration, the To: and From: lines can
               have a user-defined RFC 2822 comment appended for spam mail.
               The subject line may have a user-defined string prepended to it
               for spam mail.

           X-Spam-* headers for all mails
               Depending on the configuration, zero or more headers with names
               beginning with "X-Spam-" will be added to mail depending on
               whether it is spam or ham.

           spam message with report_safe
               If report_safe is set to true (1), then spam messages are
               encapsulated into their own message/rfc822 MIME attachment
               without any modifications being made.

               If report_safe is set to false (0), then the message will only
               have the above headers added/modified.

       $status->set_tag($tagname, $value)
           Set a template tag, as used in "add_header", report templates, etc.
           This API is intended for use by plugins.  Tag names will be
           converted to an all-uppercase representation internally.

           $value can be a subroutine reference, which will be evaluated each
           time the template is expanded.  Note that Perl supports closures,
           which means that variables set in the caller's scope can be
           accessed inside this "sub".  For example:

               my $text = "hello world!";
               $status->set_tag("FOO", sub {
                 return $text;
               });

           See "Mail::SpamAssassin::Conf"'s "TEMPLATE TAGS" section for more
           details on how template tags are used.

           "undef" will be returned if a tag by that name has not been
           defined.

       $string = $status->get_tag($tagname)
           Get the current value of a template tag, as used in "add_header",
           report templates, etc.  This API is intended for use by plugins.
           Tag names will be converted to an all-uppercase representation
           internally.  See "Mail::SpamAssassin::Conf"'s "TEMPLATE TAGS"
           section for more details on tags.

           "undef" will be returned if a tag by that name has not been
           defined.

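           As an illustration of the upper-casing rule shared by set_tag()
           and get_tag(), here is a small sketch.  publish_verdict() and the
           tag name "MyPluginVerdict" are invented for this example; $status
           is assumed to be the PerMsgStatus object handed to a plugin.

```perl
use strict;
use warnings;

# Hypothetical plugin helper: publishes a verdict as a template tag and
# reads it back.  The tag name is made up for this sketch.
sub publish_verdict {
    my ($status, $verdict) = @_;

    # Tag names are converted to upper case internally, so this call
    # defines the tag MYPLUGINVERDICT.
    $status->set_tag("MyPluginVerdict", $verdict);

    # get_tag() returns undef for a tag that has not been defined.
    my $value = $status->get_tag("MYPLUGINVERDICT");
    return defined $value ? $value : "(unset)";
}
```
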
       $status->set_spamd_result_item($subref)
           Set an entry for the spamd result log line.  $subref should be a
           code reference for a subroutine which will return a string in
           'name=VALUE' format, similar to the other entries in the spamd
           result line:

             Jul 17 14:10:47 radish spamd[16670]: spamd: result: Y 22 - ALL_NATURAL,
             DATE_IN_FUTURE_03_06,DIET_1,DRUGS_ERECTILE,DRUGS_PAIN,
             TEST_FORGED_YAHOO_RCVD,TEST_INVALID_DATE,TEST_NOREALNAME,
             TEST_NORMAL_HTTP_TO_IP,UNDISC_RECIPS scantime=0.4,size=3138,user=jm,
             uid=1000,required_score=5.0,rhost=localhost,raddr=127.0.0.1,
             rport=33153,mid=<9PS291LhupY>,autolearn=spam

           "name" and "VALUE" must not contain "=" or "," characters, as it is
           important that these log lines are easy to parse.

           The code reference will be called by spamd after the message has
           been scanned, and the "PerMsgStatus::check()" method has returned.

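           One way a plugin might honour the "=" and "," restriction is
           sketched below.  add_result_item() is a hypothetical helper, and
           replacing the forbidden characters with "_" is merely this
           sketch's choice, not something SpamAssassin prescribes.

```perl
use strict;
use warnings;

# Hypothetical helper: registers one name=VALUE entry for the spamd
# result log line.
sub add_result_item {
    my ($status, $name, $value) = @_;

    # "=" and "," would make the log line ambiguous, so this sketch
    # replaces them with "_" (an arbitrary choice).
    s/[=,]/_/g for ($name, $value);

    # The closure is called later by spamd, after check() has returned.
    $status->set_spamd_result_item(sub { return "$name=$value" });
    return;
}
```
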
       $status->finish ()
           Indicate that this $status object is finished with, and can be
           destroyed.

           If you are using SpamAssassin in a persistent environment, or
           checking many mail messages from one "Mail::SpamAssassin" factory,
           this method should be called to ensure Perl's garbage collection
           will clean up old status objects.

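           A sketch of the "many messages, one factory" case follows.
           count_spam() is a hypothetical helper; $spamtest is assumed to be
           a Mail::SpamAssassin object as in the SYNOPSIS.  The call to
           finish() on the parsed message object comes from the
           Mail::SpamAssassin::Message API and is an assumption here, since
           this page does not describe it.

```perl
use strict;
use warnings;

# Hypothetical helper: checks several raw messages with one factory
# object and releases per-message state after each one.
sub count_spam {
    my ($spamtest, @raw_messages) = @_;
    my $spam = 0;
    for my $raw (@raw_messages) {
        my $mail   = $spamtest->parse($raw);
        my $status = $spamtest->check($mail);
        $spam++ if $status->is_spam();

        # Release the per-message state so it can be garbage collected.
        $status->finish();
        $mail->finish();   # assumption: message objects offer finish() too
    }
    return $spam;
}
```
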
       $name = $status->get_current_eval_rule_name()
           Return the name of the currently-running eval rule.  "undef" is
           returned if no eval rule is currently being run.  Useful for
           plugins to determine the current rule name while inside an eval
           test function call.

       $status->get_decoded_body_text_array ()
           Returns the message body, with base64 or quoted-printable encodings
           decoded, and non-text parts or non-inline attachments stripped.

           It is returned as an array of strings, with each string
           representing one newline-separated line of the body.

       $status->get_decoded_stripped_body_text_array ()
           Returns the message body, decoded (as described in
           get_decoded_body_text_array()), with HTML rendered, and with
           whitespace normalized.

           It will always render text/html, and will use a heuristic to
           determine if other text/* parts should be considered text/html.

           It is returned as an array of strings, with each string
           representing one 'paragraph'.  Paragraphs, in plain-text mails, are
           double-newline-separated blocks of multi-line text.

       $status->get (header_name [, default_value])
           Returns a message header, pseudo-header, real name or address.
           "header_name" is the name of a mail header, such as 'Subject',
           'To', etc.  If "default_value" is given, it will be used if the
           requested "header_name" does not exist.

           Appending ":raw" to the header name will inhibit decoding of
           quoted-printable or base-64 encoded strings.

           Appending ":addr" to the header name will cause everything except
           the first email address to be removed from the header.  For
           example, all of the following will result in "example@foo":

           example@foo
           example@foo (Foo Blah)
           example@foo, example@bar
           display: example@foo (Foo Blah), example@bar ;
           Foo Blah <example@foo>
           "Foo Blah" <example@foo>
           "'Foo Blah'" <example@foo>

           Appending ":name" to the header name will cause everything except
           the first display name to be removed from the header.  For example,
           all of the following will result in "Foo Blah":

           example@foo (Foo Blah)
           example@foo (Foo Blah), example@bar
           display: example@foo (Foo Blah), example@bar ;
           Foo Blah <example@foo>
           "Foo Blah" <example@foo>
           "'Foo Blah'" <example@foo>

           There are several special pseudo-headers that can be specified:

           "ALL" can be used to mean the text of all the message's headers.

           "ALL-TRUSTED" can be used to mean the text of all the message's
           headers that could only have been added by trusted relays.

           "ALL-INTERNAL" can be used to mean the text of all the message's
           headers that could only have been added by internal relays.

           "ALL-UNTRUSTED" can be used to mean the text of all the message's
           headers that may have been added by untrusted relays.  To make this
           pseudo-header more useful for header rules the 'Received' header
           that was added by the last trusted relay is included, even though
           it can be trusted.

           "ALL-EXTERNAL" can be used to mean the text of all the message's
           headers that may have been added by external relays.  Like
           "ALL-UNTRUSTED" the 'Received' header added by the last internal
           relay is included.

           "ToCc" can be used to mean the contents of both the 'To' and 'Cc'
           headers.

           "EnvelopeFrom" is the address used in the 'MAIL FROM:' phase of the
           SMTP transaction that delivered this message, if this data has been
           made available by the SMTP server.

           "MESSAGEID" is a symbol meaning all Message-Id's found in the
           message; some mailing list software moves the real 'Message-Id' to
           'Resent-Message-Id' or 'X-Message-Id', then uses its own one in the
           'Message-Id' header.  The value returned for this symbol is the
           text from all 3 headers, separated by newlines.

           "X-Spam-Relays-Untrusted" is the generated metadata of untrusted
           relays the message has passed through.

           "X-Spam-Relays-Trusted" is the generated metadata of trusted relays
           the message has passed through.

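
           The ":addr", ":name" and ":raw" suffixes and the default value
           can be combined as in the sketch below.  sender_info() is a
           hypothetical helper name; each call is forced into scalar context
           as a precaution, so that every hash key receives exactly one
           value.

```perl
use strict;
use warnings;

# Hypothetical helper: collects a few header-derived values from a
# PerMsgStatus object into a hash reference.
sub sender_info {
    my ($status) = @_;
    return {
        # Falls back to the default when there is no Subject header.
        subject => scalar $status->get('Subject', '(no subject)'),
        # First e-mail address and first display name of the From header.
        address => scalar $status->get('From:addr'),
        name    => scalar $status->get('From:name'),
        # To header without quoted-printable or base-64 decoding.
        raw_to  => scalar $status->get('To:raw'),
    };
}
```
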
       $status->get_uri_list ()
           Returns an array of all unique URIs found in the message.  It takes
           a combination of the URIs found in the rendered (decoded and HTML
           stripped) body and the URIs found when parsing the HTML in the
           message.  Will also set $status->{uri_list} (the array as returned
           by this function).

           The returned array will include the "raw" URI as well as "slightly
           cooked" versions.  For example, the single URI
           'http://%77&#00119;%77.example.com/' will get turned into: (
           'http://%77&#00119;%77.example.com/', 'http://www.example.com/' )

       $status->get_uri_detail_list ()
           Returns a hash reference of all unique URIs found in the message
           and various data about where the URIs were found in the message.
           It takes a combination of the URIs found in the rendered (decoded
           and HTML stripped) body and the URIs found when parsing the HTML in
           the message.  Will also set $status->{uri_detail_list} (the hash
           reference as returned by this function).  This function will also
           set $status->{uri_domain_count} (count of unique domains).

           The hash format looks something like this:

             raw_uri => {
               types => { a => 1, img => 1, parsed => 1 },
               cleaned => [ canonified_uri ],
               anchor_text => [ "click here", "no click here" ],
               domains => { domain1 => 1, domain2 => 1 },
             }

           "raw_uri" is whatever the URI was in the message itself
           (http://spamassassin.apache%2Eorg/).

           "types" is a hash of the HTML tags (lowercase) which referenced the
           raw_uri.  parsed is a faked type which specifies that the raw_uri
           was seen in the rendered text.

           "cleaned" is an array of the raw and canonified version of the
           raw_uri (http://spamassassin.apache%2Eorg/,
           http://spamassassin.apache.org/).

           "anchor_text" is an array of the anchor text (text between <a> and
           </a>), if any, which linked to the URI.

           "domains" is a hash of the domains found in the canonified URIs.

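           Walking the structure described above might look like the
           following sketch.  domains_with_anchor_text() is a hypothetical
           helper that lists the domains of every URI that was linked with
           some anchor text; the "|| []" and "|| {}" fallbacks are defensive
           assumptions in case a key is absent.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the sorted, unique domains of all URIs
# that appeared with anchor text.  Call it in list context.
sub domains_with_anchor_text {
    my ($status) = @_;
    my $detail = $status->get_uri_detail_list();   # hash reference
    my %seen;
    for my $raw_uri (keys %$detail) {
        my $info = $detail->{$raw_uri};
        # Skip URIs that were not linked with any anchor text.
        next unless @{ $info->{anchor_text} || [] };
        $seen{$_} = 1 for keys %{ $info->{domains} || {} };
    }
    return sort keys %seen;
}
```
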
       $status->clear_test_state()
           Clear test state, including test log messages from
           "$status->test_log()".

       $status->got_hit ($rulename, $desc_prepend [, name => value, ...])
           Register a hit against a rule in the ruleset.

           There are two mandatory arguments. These are $rulename, the name of
           the rule that fired, and $desc_prepend, which is a short string
           that will be prepended to the rule's "describe" string in output
           reports.

           In addition, callers can supplement that with the following
           optional data:

           score => $num
               Optional: the score to use for the rule hit.  If unspecified,
               the value from the "Mail::SpamAssassin::Conf" object's
               "{scores}" hash will be used (a configured score), and in its
               absence the "defscore" option value.

           defscore => $num
               Optional: the score to use for the rule hit if neither the
               option "score" is provided, nor a configured score value is
               provided.

           value => $num
               Optional: the value to assign to the rule; the default value is
               1.  tflags multiple rules use values of greater than 1 to
               indicate multiple hits.  This value is accessible to meta
               rules.

           ruletype => $type
               Optional, but recommended: the rule type string.  This is used
               in the "hit_rule" plugin call, called by this method.  If
               unset, 'unknown' is used.

           tflags => $string
               Optional: a string, i.e. a space-separated list of additional
               tflags to be appended to an existing list of flags in
               $self->{conf}->{tflags}, such as: "nice noautolearn multiple".
               No syntax checks are performed.

           description => $string
               Optional: a custom rule description string.  This is used in
               the "hit_rule" plugin call, called by this method. If unset,
               the static description is used.

           Backwards compatibility: the two mandatory arguments have been part
           of this API since SpamAssassin 2.x.  The optional name => value
           pairs, however, are a new addition in SpamAssassin 3.2.0.

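           A call combining the mandatory arguments with a few of the
           optional pairs could look like this sketch.  register_hit() is a
           hypothetical helper, and the values chosen for "ruletype" and
           "defscore" are illustrative assumptions rather than recommended
           settings.

```perl
use strict;
use warnings;

# Hypothetical helper: registers a hit for $rulename and records how
# many times the rule matched.
sub register_hit {
    my ($status, $rulename, $count) = @_;
    $status->got_hit(
        $rulename,
        "",                    # $desc_prepend: nothing extra in reports
        ruletype => 'eval',    # assumed rule type for this sketch
        defscore => 0.1,       # used only if no score is configured
        value    => $count,    # visible to meta rules
    );
    return;
}
```
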
       $status->create_fulltext_tmpfile (fulltext_ref)
           This function creates a temporary file containing the passed scalar
           reference data (typically the full/pristine text of the message).
           This is typically used by external programs like pyzor and dccproc,
           to avoid hangs due to buffering issues.  Methods that need this
           should call $self->create_fulltext_tmpfile($fulltext) to retrieve
           the temporary filename; it will be created if it has not already
           been.

           Note: This can only be called once until
           $status->delete_fulltext_tmpfile() is called.

       $status->delete_fulltext_tmpfile ()
           Cleans up after a $status->create_fulltext_tmpfile() call.
           Deletes the temporary file and uncaches the filename.

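           Because the temporary file must be deleted before another one can
           be created, it can help to pair the two calls in one place.  The
           sketch below does so; with_fulltext_tmpfile() and its $callback
           argument are hypothetical names introduced for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: creates the temporary file, passes its name to
# $callback, and always deletes the file afterwards, even when the
# callback dies.
sub with_fulltext_tmpfile {
    my ($status, $fulltext_ref, $callback) = @_;
    my $path = $status->create_fulltext_tmpfile($fulltext_ref);

    my @result = eval { $callback->($path) };
    my $err = $@;

    $status->delete_fulltext_tmpfile();   # clean up in every case
    die $err if $err;                     # re-throw the callback's error
    return @result;
}
```
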
