Mail::SpamAssassin::PerMsgStatus(3)  User Contributed Perl Documentation  Mail::SpamAssassin::PerMsgStatus(3)

NAME
Mail::SpamAssassin::PerMsgStatus - per-message status (spam or
not-spam)

SYNOPSIS
    my $spamtest = new Mail::SpamAssassin ({
        'rules_filename'     => '/etc/spamassassin.rules',
        'userprefs_filename' => $ENV{HOME}.'/.spamassassin/user_prefs'
    });
    my $mail = $spamtest->parse();

    my $status = $spamtest->check ($mail);

    my $rewritten_mail;
    if ($status->is_spam()) {
        $rewritten_mail = $status->rewrite_mail ();
    }
    ...

DESCRIPTION
The Mail::SpamAssassin "check()" method returns an object of this
class. This object encapsulates all the per-message state.

METHODS
29 $status->check ()
30 Runs the SpamAssassin rules against the message pointed to by the
31 object.
32
33 $status->learn()
After a mail message has been checked, this method can be called.
If the score is outside a certain range around the threshold, i.e.
if the message is judged more-or-less definitely spam or definitely
non-spam, it will be fed into SpamAssassin's learning systems
(currently the naive Bayesian classifier), so that future similar
mails will be caught.
40
41 $score = $status->get_autolearn_points()
42 Return the message's score as computed for auto-learning. Certain
43 tests are ignored:
44
45 - rules with tflags set to 'learn' (the Bayesian rules)
46
47 - rules with tflags set to 'userconf' (user white/black-listing rules, etc)
48
49 - rules with tflags set to 'noautolearn'
50
51 Also note that auto-learning occurs using scores from either
52 scoreset 0 or 1, depending on what scoreset is used during message
53 check. It is likely that the message check and auto-learn scores
54 will be different.
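
As a rough illustration of the exclusions listed above, the following sketch sums rule scores while skipping rules whose tflags include 'learn', 'userconf' or 'noautolearn'. The rule names, scores and the autolearn_points helper are invented for the example; this is not SpamAssassin's internal implementation.

```perl
use strict;
use warnings;

# Hypothetical hits: rule name => { score, tflags }. Not real SpamAssassin data.
my %hits = (
    EXAMPLE_BAYES    => { score => 3.5,  tflags => 'learn' },
    EXAMPLE_USERLIST => { score => -100, tflags => 'userconf nice' },
    EXAMPLE_HTML     => { score => 0.5,  tflags => '' },
    EXAMPLE_DATE     => { score => 1.5,  tflags => '' },
);

# Sum the scores of all rules except those flagged learn, userconf or
# noautolearn.
sub autolearn_points {
    my ($hits) = @_;
    my $total = 0;
    for my $rule (keys %$hits) {
        my %flags = map { $_ => 1 } split ' ', $hits->{$rule}{tflags};
        next if $flags{learn} || $flags{userconf} || $flags{noautolearn};
        $total += $hits->{$rule}{score};
    }
    return $total;
}

print autolearn_points(\%hits), "\n";
```

Only the two unflagged example rules contribute to the total here, which is why the auto-learn score can differ from the score used for the spam decision.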
55
56 $score = $status->get_head_only_points()
57 Return the message's score as computed for auto-learning, ignoring
58 all rules except for header-based ones.
59
60 $score = $status->get_learned_points()
61 Return the message's score as computed for auto-learning, ignoring
62 all rules except for learning-based ones.
63
64 $score = $status->get_body_only_points()
65 Return the message's score as computed for auto-learning, ignoring
66 all rules except for body-based ones.
67
68 $score = $status->get_autolearn_force_status()
69 Return whether a message's score included any rules that are
70 flagged as autolearn_force.
71
72 $rule_names = $status->get_autolearn_force_names()
Return a comma-separated list of rule names if a message's
score included any rules that are flagged as autolearn_force.
75
76 $isspam = $status->is_spam ()
77 After a mail message has been checked, this method can be called.
78 It will return 1 for mail determined likely to be spam, 0 if it
79 does not seem spam-like.
80
81 $list = $status->get_names_of_tests_hit ()
82 After a mail message has been checked, this method can be called.
83 It will return a comma-separated string, listing all the symbolic
84 test names of the tests which were triggered by the mail.
85
86 $list = $status->get_names_of_tests_hit_with_scores_hash ()
After a mail message has been checked, this method can be called.
It will return a reference to a hash of rule and score pairs, covering
the symbolic test names and individual scores of the tests which
were triggered by the mail.
91
92 $list = $status->get_names_of_tests_hit_with_scores ()
93 After a mail message has been checked, this method can be called.
94 It will return a comma-separated string of rule=score pairs for all
95 the symbolic test names and individual scores of the tests which
96 were triggered by the mail.
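
A caller that only has the comma-separated string can split it back into pairs. The sketch below assumes the "rule=score,rule=score" shape described above; the sample string, the rule names and the parse_scores helper are made up for illustration.

```perl
use strict;
use warnings;

# Sample value in the documented "rule=score,rule=score" shape (made-up rules).
my $list = 'EXAMPLE_RULE_A=2.5,EXAMPLE_RULE_B=0.1,EXAMPLE_RULE_C=-1.0';

# Split on commas, then on the first "=" of each pair.
sub parse_scores {
    my ($str) = @_;
    my %score;
    for my $pair (split /,/, $str) {
        my ($rule, $value) = split /=/, $pair, 2;
        $score{$rule} = $value;
    }
    return \%score;
}

my $scores = parse_scores($list);
printf "%s scored %s\n", $_, $scores->{$_} for sort keys %$scores;
```

When a hash is wanted directly, get_names_of_tests_hit_with_scores_hash () avoids the parsing step.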
97
98 $list = $status->get_names_of_subtests_hit ()
99 After a mail message has been checked, this method can be called.
100 It will return a comma-separated string, listing all the symbolic
101 test names of the meta-rule sub-tests which were triggered by the
102 mail. Sub-tests are the normally-hidden rules, which score 0 and
103 have names beginning with two underscores, used in meta rules.
104
105 $num = $status->get_score ()
106 After a mail message has been checked, this method can be called.
107 It will return the message's score.
108
109 $num = $status->get_required_score ()
110 After a mail message has been checked, this method can be called.
111 It will return the score required for a mail to be considered spam.
112
113 $num = $status->get_autolearn_status ()
114 After a mail message has been checked, this method can be called.
115 It will return one of the following strings depending on whether
116 the mail was auto-learned or not: "ham", "no", "spam", "disabled",
117 "failed", "unavailable".
118
If the message is flagged with autolearn_force, the string will also
include that status and the rules hit. For example:
"autolearn_force=yes (AUTOLEARNTEST_BODY)"
122
123 $report = $status->get_report ()
124 Deliver a "spam report" on the checked mail message. This contains
125 details of how many spam detection rules it triggered.
126
127 The report is returned as a multi-line string, with the lines
128 separated by "\n" characters.
129
130 $preview = $status->get_content_preview ()
131 Give a "preview" of the content.
132
133 This is returned as a multi-line string, with the lines separated
134 by "\n" characters, containing a fully-decoded, safe, plain-text
135 sample of the first few lines of the message body.
136
137 $msg = $status->get_message()
138 Return the object representing the message being scanned.
139
140 $status->rewrite_mail ()
141 Rewrite the mail message. This will at minimum add headers, and at
142 maximum MIME-encapsulate the message text, to reflect its spam or
143 not-spam status. The function will return a scalar of the
144 rewritten message.
145
146 The actual modifications depend on the configuration (see
147 "Mail::SpamAssassin::Conf" for more information).
148
149 The possible modifications are as follows:
150
151 To:, From: and Subject: modification on spam mails
152 Depending on the configuration, the To: and From: lines can
153 have a user-defined RFC 2822 comment appended for spam mail.
154 The subject line may have a user-defined string prepended to it
155 for spam mail.
156
157 X-Spam-* headers for all mails
158 Depending on the configuration, zero or more headers with names
159 beginning with "X-Spam-" will be added to mail depending on
160 whether it is spam or ham.
161
162 spam message with report_safe
163 If report_safe is set to true (1), then spam messages are
164 encapsulated into their own message/rfc822 MIME attachment
165 without any modifications being made.
166
167 If report_safe is set to false (0), then the message will only
168 have the above headers added/modified.
169
170 $status->action_depends_on_tags($tags, $code, @args)
Enqueue the supplied subroutine reference $code, to become runnable
when all the specified tags become available. $tags may be a
simple scalar (a tag name) or a listref of tag names. When called,
the subroutine &$code will be passed a "permessagestatus"
object as its first argument, followed by the supplied (optional)
list @args.
177
178 $status->set_tag($tagname, $value)
179 Set a template tag, as used in "add_header", report templates, etc.
180 This API is intended for use by plugins. Tag names will be
181 converted to an all-uppercase representation internally.
182
183 $value can be a simple scalar (string or number), or a reference to
184 an array, in which case the public method get_tag will join array
185 elements using a space as a separator, returning a single string
186 for backward compatibility.
187
$value can also be a subroutine reference, which will be evaluated
each time the template is expanded. The first argument passed by
get_tag to a called subroutine will be a PerMsgStatus object (this
module's object), followed by optional arguments provided by a caller
to get_tag.
193
194 Note that perl supports closures, which means that variables set in
195 the caller's scope can be accessed inside this "sub". For example:
196
    my $text = "hello world!";
    $status->set_tag("FOO", sub {
        my $pms = shift;
        return $text;
    });
202
203 See "Mail::SpamAssassin::Conf"'s "TEMPLATE TAGS" section for more
204 details on how template tags are used.
205
206 "undef" will be returned if a tag by that name has not been
207 defined.
208
209 $string = $status->get_tag($tagname)
210 Get the current value of a template tag, as used in "add_header",
211 report templates, etc. This API is intended for use by plugins.
212 Tag names will be converted to an all-uppercase representation
213 internally. See "Mail::SpamAssassin::Conf"'s "TEMPLATE TAGS"
214 section for more details on tags.
215
216 "undef" will be returned if a tag by that name has not been
217 defined.
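
The behaviour described for set_tag and get_tag (upper-cased names, array references joined with a space, subroutine references evaluated on each lookup) can be modelled with a small stand-in class. TagStore below is hypothetical and exists only for this example; it is not the PerMsgStatus implementation.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the tag storage, NOT the real PerMsgStatus class.
package TagStore;

sub new {
    my ($class) = @_;
    my $self = { tags => {} };
    return bless $self, $class;
}

# Tag names are stored in upper case, as the text describes.
sub set_tag {
    my ($self, $name, $value) = @_;
    $self->{tags}{ uc $name } = $value;
    return;
}

sub get_tag {
    my ($self, $name, @args) = @_;
    my $value = $self->{tags}{ uc $name };
    return undef unless defined $value;
    # Subroutine references are evaluated on every lookup.
    $value = $value->($self, @args) if ref $value eq 'CODE';
    # Array references are joined with a single space.
    $value = join ' ', @$value if ref $value eq 'ARRAY';
    return $value;
}

package main;

my $store = TagStore->new;
my $text  = 'hello world!';
$store->set_tag('foo',  sub { my $pms = shift; return $text });
$store->set_tag('LIST', [ 'a', 'b', 'c' ]);

print $store->get_tag('FOO'), "\n";    # value captured by the closure
$text = 'changed';
print $store->get_tag('FOO'), "\n";    # re-evaluated on each lookup
print $store->get_tag('LIST'), "\n";   # array joined with spaces
```

Because the subroutine closes over $text, a later change to that variable is visible the next time the tag is read.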
218
219 $string = $status->get_tag_raw($tagname, @args)
220 Similar to "get_tag", but keeps a tag name unchanged (does not
221 uppercase it), and does not convert arrayref tag values into a
222 single string.
223
224 $status->set_spamd_result_item($subref)
225 Set an entry for the spamd result log line. $subref should be a
226 code reference for a subroutine which will return a string in
227 'name=VALUE' format, similar to the other entries in the spamd
228 result line:
229
    Jul 17 14:10:47 radish spamd[16670]: spamd: result: Y 22 - ALL_NATURAL,
      DATE_IN_FUTURE_03_06,DIET_1,DRUGS_ERECTILE,DRUGS_PAIN,
      TEST_FORGED_YAHOO_RCVD,TEST_INVALID_DATE,TEST_NOREALNAME,
      TEST_NORMAL_HTTP_TO_IP,UNDISC_RECIPS scantime=0.4,size=3138,user=jm,
      uid=1000,required_score=5.0,rhost=localhost,raddr=127.0.0.1,
      rport=33153,mid=<9PS291LhupY>,autolearn=spam
236
237 "name" and "VALUE" must not contain "=" or "," characters, as it is
238 important that these log lines are easy to parse.
239
240 The code reference will be called by spamd after the message has
241 been scanned, and the "PerMsgStatus::check()" method has returned.
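
A minimal sketch of a suitable code reference follows. The make_result_item helper and the 'example=ok' entry are hypothetical; the helper simply enforces the rule that neither the name nor the value may contain "=" or ",". The second half shows why that rule matters: the name=VALUE part of a result line can then be split with two plain split calls.

```perl
use strict;
use warnings;

# Hypothetical helper: build a code reference that yields "name=VALUE",
# refusing names or values containing "=" or ",".
sub make_result_item {
    my ($name, $value) = @_;
    for my $part ($name, $value) {
        die "invalid character in '$part'\n" if $part =~ /[=,]/;
    }
    return sub { return "$name=$value" };
}

my $subref = make_result_item('example', 'ok');
print $subref->(), "\n";

# The name=VALUE tail of a result line can be split back into a hash.
my $tail = 'scantime=0.4,size=3138,user=jm,uid=1000,required_score=5.0';
my %field = map {
    my ($k, $v) = split /=/, $_, 2;
    ($k => $v);
} split /,/, $tail;
print "size is $field{size}\n";
```

In a plugin, a reference like $subref would be handed to $status->set_spamd_result_item($subref).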
242
243 $status->finish ()
244 Indicate that this $status object is finished with, and can be
245 destroyed.
246
247 If you are using SpamAssassin in a persistent environment, or
248 checking many mail messages from one "Mail::SpamAssassin" factory,
249 this method should be called to ensure Perl's garbage collection
250 will clean up old status objects.
251
252 $name = $status->get_current_eval_rule_name()
253 Return the name of the currently-running eval rule. "undef" is
254 returned if no eval rule is currently being run. Useful for
255 plugins to determine the current rule name while inside an eval
256 test function call.
257
258 $status->get_decoded_body_text_array ()
259 Returns the message body, with base64 or quoted-printable encodings
260 decoded, and non-text parts or non-inline attachments stripped.
261
262 It is returned as an array of strings, with each string
263 representing one newline-separated line of the body.
264
265 $status->get_decoded_stripped_body_text_array ()
266 Returns the message body, decoded (as described in
267 get_decoded_body_text_array()), with HTML rendered, and with
268 whitespace normalized.
269
270 It will always render text/html, and will use a heuristic to
271 determine if other text/* parts should be considered text/html.
272
273 It is returned as an array of strings, with each string
274 representing one 'paragraph'. Paragraphs, in plain-text mails, are
275 double-newline-separated blocks of multi-line text.
276
277 $status->get (header_name [, default_value])
278 Returns a message header, pseudo-header, real name or address.
279 "header_name" is the name of a mail header, such as 'Subject',
280 'To', etc. If "default_value" is given, it will be used if the
281 requested "header_name" does not exist.
282
283 Appending ":raw" to the header name will inhibit decoding of
284 quoted-printable or base-64 encoded strings.
285
286 Appending a modifier ":addr" to a header field name will cause
287 everything except the first email address to be removed from the
288 header field. It is mainly applicable to header fields 'From',
289 'Sender', 'To', 'Cc' along with their 'Resent-*' counterparts, and
290 the 'Return-Path'. For example, all of the following will result in
291 "example@foo":
292
    example@foo
    example@foo (Foo Blah)
    example@foo, example@bar
    display: example@foo (Foo Blah), example@bar ;
    Foo Blah <example@foo>
    "Foo Blah" <example@foo>
    "'Foo Blah'" <example@foo>
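
The effect of ":addr" on the inputs above can be approximated with a single regular expression. The first_addr helper below is a simplified stand-in written for this example; it is not the parser SpamAssassin uses, and real header fields have many more cases than it covers.

```perl
use strict;
use warnings;

# Simplified approximation: take the first token that contains "@" and none
# of the delimiter characters. Real header parsing has many more cases.
sub first_addr {
    my ($field) = @_;
    return $field =~ /([^\s<>(),;:"']+\@[^\s<>(),;:"']+)/ ? $1 : undef;
}

my @examples = (
    'example@foo',
    'example@foo (Foo Blah)',
    'example@foo, example@bar',
    'display: example@foo (Foo Blah), example@bar ;',
    'Foo Blah <example@foo>',
    '"Foo Blah" <example@foo>',
    q{"'Foo Blah'" <example@foo>},
);

print first_addr($_), "\n" for @examples;
```

In rule or plugin code the same result would normally be obtained by asking for the header with the modifier, for example $status->get('From:addr').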
300
301 Appending a modifier ":name" to a header field name will cause
302 everything except the first display name to be removed from the
303 header field. It is mainly applicable to header fields containing a
304 single mail address: 'From', 'Sender', along with their
305 'Resent-From' and 'Resent-Sender' counterparts. For example, all
306 of the following will result in "Foo Blah". One level of single
307 quotes is stripped too, as it is often seen.
308
    example@foo (Foo Blah)
    example@foo (Foo Blah), example@bar
    display: example@foo (Foo Blah), example@bar ;
    Foo Blah <example@foo>
    "Foo Blah" <example@foo>
    "'Foo Blah'" <example@foo>
315
There are several special pseudo-headers that can be specified:

"ALL" can be used to mean the text of all the message's headers.

"ALL-TRUSTED" can be used to mean the text of all the message's
headers that could only have been added by trusted relays.

"ALL-INTERNAL" can be used to mean the text of all the message's
headers that could only have been added by internal relays.

"ALL-UNTRUSTED" can be used to mean the text of all the message's
headers that may have been added by untrusted relays. To make this
pseudo-header more useful for header rules, the 'Received' header
that was added by the last trusted relay is included, even though
it can be trusted.

"ALL-EXTERNAL" can be used to mean the text of all the message's
headers that may have been added by external relays. Like
"ALL-UNTRUSTED", the 'Received' header added by the last internal
relay is included.

"ToCc" can be used to mean the contents of both the 'To' and 'Cc'
headers.

"EnvelopeFrom" is the address used in the 'MAIL FROM:' phase of the
SMTP transaction that delivered this message, if this data has been
made available by the SMTP server.

"MESSAGEID" is a symbol meaning all Message-Ids found in the
message; some mailing list software moves the real 'Message-Id' to
'Resent-Message-Id' or 'X-Message-Id', then uses its own one in the
'Message-Id' header. The value returned for this symbol is the
text from all 3 headers, separated by newlines.

"X-Spam-Relays-Untrusted" is the generated metadata of untrusted
relays the message has passed through.

"X-Spam-Relays-Trusted" is the generated metadata of trusted relays
the message has passed through.

$status->get_uri_list ()
347 Returns an array of all unique URIs found in the message. It takes
348 a combination of the URIs found in the rendered (decoded and HTML
349 stripped) body and the URIs found when parsing the HTML in the
350 message. Will also set $status->{uri_list} (the array as returned
351 by this function).
352
The returned array will include the "raw" URI as well as "slightly
cooked" versions. For example, the single URI
'http://%77w%77.example.com/' will get turned into:

    ( 'http://%77w%77.example.com/', 'http://www.example.com/' )
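
The relationship between the raw and the cooked form in this example is plain percent-decoding. The sketch below reproduces just that step with a hypothetical raw_and_decoded helper; SpamAssassin's own canonicalization may do more than this.

```perl
use strict;
use warnings;

# Return the raw URI plus a percent-decoded variant, without duplicates.
# Only an illustration of the "raw" versus "slightly cooked" idea.
sub raw_and_decoded {
    my ($raw) = @_;
    (my $decoded = $raw) =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $decoded eq $raw ? ($raw) : ($raw, $decoded);
}

print "$_\n" for raw_and_decoded('http://%77w%77.example.com/');
```

A URI with nothing to decode yields a single entry, so the returned list stays free of duplicates.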
357
358 $status->get_uri_detail_list ()
359 Returns a hash reference of all unique URIs found in the message
360 and various data about where the URIs were found in the message.
361 It takes a combination of the URIs found in the rendered (decoded
362 and HTML stripped) body and the URIs found when parsing the HTML in
363 the message. Will also set $status->{uri_detail_list} (the hash
364 reference as returned by this function). This function will also
365 set $status->{uri_domain_count} (count of unique domains).
366
367 The hash format looks something like this:
368
    raw_uri => {
        types       => { a => 1, img => 1, parsed => 1 },
        cleaned     => [ canonicalized_uri ],
        anchor_text => [ "click here", "no click here" ],
        domains     => { domain1 => 1, domain2 => 1 },
    }
375
376 "raw_uri" is whatever the URI was in the message itself
377 (http://spamassassin.apache%2Eorg/).
378
379 "types" is a hash of the HTML tags (lowercase) which referenced the
380 raw_uri. parsed is a faked type which specifies that the raw_uri
381 was seen in the rendered text.
382
383 "cleaned" is an array of the raw and canonicalized version of the
384 raw_uri (http://spamassassin.apache%2Eorg/,
385 http://spamassassin.apache.org/).
386
387 "anchor_text" is an array of the anchor text (text between <a> and
388 </a>), if any, which linked to the URI.
389
390 "domains" is a hash of the domains found in the canonicalized URIs.
391
392 "hosts" is a hash of unstripped hostnames found in the
393 canonicalized URIs as hash keys, with their domain part stored as a
394 value of each hash entry.
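
A short sketch of walking a structure of this shape follows. The $detail hash is sample data built by hand from the field descriptions above; in a plugin it would instead come from $status->get_uri_detail_list ().

```perl
use strict;
use warnings;

# Sample data in the documented shape; the values are illustrative only.
my $detail = {
    'http://spamassassin.apache%2Eorg/' => {
        types       => { a => 1, parsed => 1 },
        cleaned     => [ 'http://spamassassin.apache%2Eorg/',
                         'http://spamassassin.apache.org/' ],
        anchor_text => [ 'click here' ],
        domains     => { 'apache.org' => 1 },
        hosts       => { 'spamassassin.apache.org' => 'apache.org' },
    },
};

# Collect every distinct domain and note which URIs appeared in an <a> tag.
my (%domains, @linked);
for my $raw (sort keys %$detail) {
    my $info = $detail->{$raw};
    $domains{$_} = 1 for keys %{ $info->{domains} };
    push @linked, $raw if $info->{types}{a};
}

print join(',', sort keys %domains), "\n";
print scalar(@linked), " linked URI(s)\n";
```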
395
396 $status->clear_test_state()
397 Clear test state, including test log messages from
398 "$status->test_log()".
399
400 $status->got_hit ($rulename, $desc_prepend [, name => value, ...])
401 Register a hit against a rule in the ruleset.
402
There are two mandatory arguments. These are $rulename, the name of
the rule that fired, and $desc_prepend, which is a short string
that will be prepended to the rule's "describe" string in output
reports.
407
408 In addition, callers can supplement that with the following
409 optional data:
410
411 score => $num
412 Optional: the score to use for the rule hit. If unspecified,
413 the value from the "Mail::SpamAssassin::Conf" object's
414 "{scores}" hash will be used (a configured score), and in its
415 absence the "defscore" option value.
416
417 defscore => $num
418 Optional: the score to use for the rule hit if neither the
419 option "score" is provided, nor a configured score value is
420 provided.
421
422 value => $num
Optional: the value to assign to the rule; the default value is
1. Rules with tflags "multiple" use values greater than 1 to
indicate multiple hits. This value is accessible to meta
rules.
427
428 ruletype => $type
429 Optional, but recommended: the rule type string. This is used
430 in the "hit_rule" plugin call, called by this method. If
431 unset, 'unknown' is used.
432
433 tflags => $string
434 Optional: a string, i.e. a space-separated list of additional
435 tflags to be appended to an existing list of flags in
436 $self->{conf}->{tflags}, such as: "nice noautolearn multiple".
437 No syntax checks are performed.
438
439 description => $string
440 Optional: a custom rule description string. This is used in
441 the "hit_rule" plugin call, called by this method. If unset,
442 the static description is used.
443
Backward compatibility: the two mandatory arguments have been part
of this API since SpamAssassin 2.x. The optional name => value
pairs, however, are a new addition in SpamAssassin 3.2.0.
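
The precedence between the "score" option, a configured score and the "defscore" option can be written out as follows. The resolve_score helper is hypothetical and only mirrors the order described above; it is not taken from the SpamAssassin source.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the documented precedence:
# explicit "score" option, then configured score, then "defscore".
sub resolve_score {
    my ($configured, %opt) = @_;
    return $opt{score} if defined $opt{score};
    return $configured if defined $configured;
    return $opt{defscore};
}

# First argument stands in for the configured score (undef when none is set).
print resolve_score(2.0,   score => 4.5, defscore => 1.0), "\n";
print resolve_score(2.0,   defscore => 1.0), "\n";
print resolve_score(undef, defscore => 1.0), "\n";
```

A plugin that wants a fallback without overriding the administrator's configuration would therefore pass "defscore" rather than "score" to got_hit.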
447
448 $status->create_fulltext_tmpfile (fulltext_ref)
This function creates a temporary file containing the passed scalar
reference data (typically the full/pristine text of the message).
This is typically used when invoking external programs such as pyzor
and dccproc, to avoid hangs due to buffering issues. Methods that need
this should call $self->create_fulltext_tmpfile($fulltext) to retrieve
the temporary filename; it will be created if it has not already
been.
456
457 Note: This can only be called once until
458 $status->delete_fulltext_tmpfile() is called.
459
460 $status->delete_fulltext_tmpfile ()
Cleans up after a $status->create_fulltext_tmpfile() call.
Deletes the temporary file and uncaches the filename.
463
464 all_from_addrs_domains
465 This function returns all the various from addresses in a message
466 using all_from_addrs() and then returns only the domain names.
467
SEE ALSO
"Mail::SpamAssassin" "spamassassin"

perl v5.28.0                      2018-09-14  Mail::SpamAssassin::PerMsgStatus(3)