Mail::SpamAssassin::PerMsgStatus(3)  User Contributed Perl Documentation  Mail::SpamAssassin::PerMsgStatus(3)

NAME
    Mail::SpamAssassin::PerMsgStatus - per-message status (spam or
    not-spam)

SYNOPSIS
    my $spamtest = Mail::SpamAssassin->new({
        'rules_filename'     => '/etc/spamassassin.rules',
        'userprefs_filename' => $ENV{HOME}.'/.spamassassin/user_prefs'
    });
    # with no argument, parse() reads the message from STDIN
    my $mail = $spamtest->parse();

    my $status = $spamtest->check ($mail);

    my $rewritten_mail;
    if ($status->is_spam()) {
        $rewritten_mail = $status->rewrite_mail ();
    }
    ...

DESCRIPTION
    The Mail::SpamAssassin "check()" method returns an object of this
    class. This object encapsulates all the per-message state.

METHODS
    $status->check ()
        Runs the SpamAssassin rules against the message pointed to by the
        object.

    $status->learn()
        After a mail message has been checked, this method can be called.
        If the score is outside a certain range around the threshold, i.e.
        if the message is judged more-or-less definitely spam or definitely
        non-spam, it will be fed into SpamAssassin's learning systems
        (currently the naive Bayesian classifier), so that future similar
        mails will be caught.

    $score = $status->get_autolearn_points()
        Return the message's score as computed for auto-learning. Certain
        tests are ignored:

        - rules with tflags set to 'learn' (the Bayesian rules)

        - rules with tflags set to 'userconf' (user white/black-listing
          rules, etc.)

        - rules with tflags set to 'noautolearn'

        Also note that auto-learning occurs using scores from either
        scoreset 0 or 1, depending on what scoreset is used during message
        check. It is likely that the message check and auto-learn scores
        will be different.

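The exclusion list above can be pictured with a small stand-alone sketch. It is not SpamAssassin's implementation: the rule names, scores and the autolearn_points() helper are invented for illustration, and the scoreset selection described above is left out entirely.

```perl
use strict;
use warnings;

# Hypothetical rule hits; names, scores and tflags are invented.
my @hits = (
    { name => 'EXAMPLE_BAYES',    score => 3.5,  tflags => 'learn' },
    { name => 'EXAMPLE_USERLIST', score => -100, tflags => 'userconf nice' },
    { name => 'EXAMPLE_LOCAL',    score => 2,    tflags => 'noautolearn' },
    { name => 'EXAMPLE_HEADER',   score => 1.5,  tflags => '' },
    { name => 'EXAMPLE_BODY',     score => 2.0 },
);

# tflags whose rules are left out of the auto-learn score.
my %ignored = map { $_ => 1 } qw(learn userconf noautolearn);

# Sum the scores of all hits that carry none of the ignored tflags.
sub autolearn_points {
    my @rules = @_;
    my $total = 0;
    for my $rule (@rules) {
        my @flags = split ' ', ($rule->{tflags} // '');
        next if grep { $ignored{$_} } @flags;
        $total += $rule->{score};
    }
    return $total;
}

printf "auto-learn points: %.1f\n", autolearn_points(@hits);
```

Only the two unflagged hits contribute here, which is why this figure can differ from the score reported by the message check.
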
    $score = $status->get_head_only_points()
        Return the message's score as computed for auto-learning, ignoring
        all rules except for header-based ones.

    $score = $status->get_learned_points()
        Return the message's score as computed for auto-learning, ignoring
        all rules except for learning-based ones.

    $score = $status->get_body_only_points()
        Return the message's score as computed for auto-learning, ignoring
        all rules except for body-based ones.

    $score = $status->get_autolearn_force_status()
        Return whether a message's score included any rules that are
        flagged as autolearn_force.

    $rule_names = $status->get_autolearn_force_names()
        Return a comma-separated list of the names of any rules flagged
        as autolearn_force that were included in the message's score.

    $isspam = $status->is_spam ()
        After a mail message has been checked, this method can be called.
        It will return 1 for mail determined likely to be spam, 0 if it
        does not seem spam-like.

    $list = $status->get_names_of_tests_hit ()
        After a mail message has been checked, this method can be called.
        It will return a comma-separated string, listing all the symbolic
        test names of the tests which were triggered by the mail.

    $list = $status->get_names_of_tests_hit_with_scores_hash ()
        After a mail message has been checked, this method can be called.
        It will return a reference to a hash of rule and score pairs for
        all the symbolic test names and individual scores of the tests
        which were triggered by the mail.

    $list = $status->get_names_of_tests_hit_with_scores ()
        After a mail message has been checked, this method can be called.
        It will return a comma-separated string of rule=score pairs for all
        the symbolic test names and individual scores of the tests which
        were triggered by the mail.

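When only the rule=score string is at hand (for instance, copied out of a log), it can be split back into a hash. The sketch below assumes the string has exactly the comma-separated rule=score shape described above; parse_hits() and the sample rule names are made up for this example.

```perl
use strict;
use warnings;

# Split a "RULE=score,RULE=score" string into a hash reference.
sub parse_hits {
    my ($list) = @_;
    my %score_of;
    for my $pair (split /,/, $list // '') {
        my ($rule, $score) = split /=/, $pair, 2;
        next unless defined $score;        # skip anything without "="
        $score_of{$rule} = $score + 0;     # force numeric value
    }
    return \%score_of;
}

# Hypothetical value, standing in for the string returned by
# $status->get_names_of_tests_hit_with_scores().
my $list   = 'EXAMPLE_RULE_A=1.5,EXAMPLE_RULE_B=-0.5,EXAMPLE_RULE_C=2';
my $scores = parse_hits($list);

# Print the hits, highest score first.
for my $rule (sort { $scores->{$b} <=> $scores->{$a} } keys %$scores) {
    printf "%-16s %5.1f\n", $rule, $scores->{$rule};
}
```

Inside a running scan, get_names_of_tests_hit_with_scores_hash() already supplies the hash form, so no parsing is needed there.
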
    $list = $status->get_names_of_subtests_hit ()
        After a mail message has been checked, this method can be called.
        It will return a comma-separated string, listing all the symbolic
        test names of the meta-rule sub-tests which were triggered by the
        mail. Sub-tests are the normally-hidden rules, which score 0 and
        have names beginning with two underscores, used in meta rules.

        If the parameter collapsed or dbg is passed, the output will be a
        condensed array of sub-tests with multiple hits reduced to one
        entry.

        If the parameter dbg is passed, the output will be a condensed
        string of sub-tests with multiple hits reduced to one entry with
        the number of hits in parentheses. Some information is also added
        at the end regarding the multiple hits.

    $num = $status->get_score ()
        After a mail message has been checked, this method can be called.
        It will return the message's score.

    $num = $status->get_required_score ()
        After a mail message has been checked, this method can be called.
        It will return the score required for a mail to be considered spam.

    $num = $status->get_autolearn_status ()
        After a mail message has been checked, this method can be called.
        It will return one of the following strings depending on whether
        the mail was auto-learned or not: "ham", "no", "spam", "disabled",
        "failed", "unavailable".

        If the message is flagged with autolearn_force, the returned
        string will also include that status and the rules hit. For
        example: "autolearn_force=yes (AUTOLEARNTEST_BODY)"

    $report = $status->get_report ()
        Deliver a "spam report" on the checked mail message. This contains
        details of how many spam detection rules it triggered.

        The report is returned as a multi-line string, with the lines
        separated by "\n" characters.

    $preview = $status->get_content_preview ()
        Give a "preview" of the content.

        This is returned as a multi-line string, with the lines separated
        by "\n" characters, containing a fully-decoded, safe, plain-text
        sample of the first few lines of the message body.

    $msg = $status->get_message()
        Return the object representing the message being scanned.

    $status->rewrite_mail ()
        Rewrite the mail message. This will at minimum add headers, and at
        maximum MIME-encapsulate the message text, to reflect its spam or
        not-spam status. The function will return a scalar containing the
        rewritten message.

        The actual modifications depend on the configuration (see
        "Mail::SpamAssassin::Conf" for more information).

        The possible modifications are as follows:

        To:, From: and Subject: modification on spam mails
            Depending on the configuration, the To: and From: lines can
            have a user-defined RFC 2822 comment appended for spam mail.
            The subject line may have a user-defined string prepended to it
            for spam mail.

        X-Spam-* headers for all mails
            Depending on the configuration, zero or more headers with names
            beginning with "X-Spam-" will be added to mail depending on
            whether it is spam or ham.

        spam message with report_safe
            If report_safe is set to true (1), then spam messages are
            encapsulated into their own message/rfc822 MIME attachment
            without any modifications being made.

            If report_safe is set to false (0), then the message will only
            have the above headers added/modified.

    $status->action_depends_on_tags($tags, $code, @args)
        Enqueue the supplied subroutine reference $code, to become runnable
        when all the specified tags become available. The $tags may be a
        simple scalar - a tag name, or a listref of tag names. The
        subroutine &$code when called will be passed a "permessagestatus"
        object as its first argument, followed by the supplied (optional)
        list @args.

    $status->set_tag($tagname, $value)
        Set a template tag, as used in "add_header", report templates, etc.
        This API is intended for use by plugins. Tag names will be
        converted to an all-uppercase representation internally.

        $value can be a simple scalar (string or number), or a reference to
        an array, in which case the public method get_tag will join array
        elements using a space as a separator, returning a single string
        for backward compatibility.

        $value can also be a subroutine reference, which will be evaluated
        each time the template is expanded. The first argument passed by
        get_tag to a called subroutine will be a PerMsgStatus object (this
        module's object), followed by optional arguments provided by a
        caller to get_tag.

        Note that perl supports closures, which means that variables set in
        the caller's scope can be accessed inside this "sub". For example:

            my $text = "hello world!";
            $status->set_tag("FOO", sub {
                my $pms = shift;
                return $text;
            });

        See "Mail::SpamAssassin::Conf"'s "TEMPLATE TAGS" section for more
        details on how template tags are used.

        "undef" will be returned if a tag by that name has not been
        defined.

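The three kinds of tag value (scalar, array reference, subroutine reference) can be illustrated with a toy tag store. ToyTags is a hypothetical stand-in written for this example, not part of SpamAssassin; it only mimics the behaviour described above (uppercased names, arrays joined with a space, subroutines evaluated on every lookup).

```perl
use strict;
use warnings;

package ToyTags;   # hypothetical stand-in, not a SpamAssassin class

sub new {
    my ($class) = @_;
    return bless { tags => {} }, $class;
}

sub set_tag {
    my ($self, $name, $value) = @_;
    $self->{tags}{ uc $name } = $value;    # names are stored uppercased
    return;
}

sub get_tag {
    my ($self, $name, @args) = @_;
    my $value = $self->{tags}{ uc $name };
    return undef unless defined $value;    # unknown tag
    # A code reference is called on every lookup, object first.
    $value = $value->($self, @args) if ref $value eq 'CODE';
    # An array reference is flattened into one space-separated string.
    $value = join ' ', @$value if ref $value eq 'ARRAY';
    return $value;
}

package main;

my $tags  = ToyTags->new;
my $count = 0;
$tags->set_tag('relays',  [ 'mx1.example', 'mx2.example' ]);
$tags->set_tag('lookups', sub { return ++$count });   # closure over $count

print $tags->get_tag('RELAYS'),  "\n";
print $tags->get_tag('Lookups'), "\n";
print $tags->get_tag('lookups'), "\n";
```

Because the subroutine closes over $count, each lookup sees the variable's current value, which is the property the closure note above relies on.
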
    $string = $status->get_tag($tagname)
        Get the current value of a template tag, as used in "add_header",
        report templates, etc. This API is intended for use by plugins.
        Tag names will be converted to an all-uppercase representation
        internally. See "Mail::SpamAssassin::Conf"'s "TEMPLATE TAGS"
        section for more details on tags.

        "undef" will be returned if a tag by that name has not been
        defined.

    $string = $status->get_tag_raw($tagname, @args)
        Similar to "get_tag", but keeps a tag name unchanged (does not
        uppercase it), and does not convert arrayref tag values into a
        single string.

    $status->set_spamd_result_item($subref)
        Set an entry for the spamd result log line. $subref should be a
        code reference for a subroutine which will return a string in
        'name=VALUE' format, similar to the other entries in the spamd
        result line:

            Jul 17 14:10:47 radish spamd[16670]: spamd: result: Y 22 - ALL_NATURAL,
            DATE_IN_FUTURE_03_06,DIET_1,DRUGS_ERECTILE,DRUGS_PAIN,
            TEST_FORGED_YAHOO_RCVD,TEST_INVALID_DATE,TEST_NOREALNAME,
            TEST_NORMAL_HTTP_TO_IP,UNDISC_RECIPS scantime=0.4,size=3138,user=jm,
            uid=1000,required_score=5.0,rhost=localhost,raddr=127.0.0.1,
            rport=33153,mid=<9PS291LhupY>,autolearn=spam

        "name" and "VALUE" must not contain "=" or "," characters, as it is
        important that these log lines are easy to parse.

        The code reference will be called by spamd after the message has
        been scanned, and the "PerMsgStatus::check()" method has returned.

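A plugin author may want to guard against the forbidden characters before handing a code reference over. The following is a minimal sketch; result_item() and the "queue" entry are invented for illustration, and the commented-out call at the end shows where the code reference would be passed in a real scan.

```perl
use strict;
use warnings;

# Build a "name=VALUE" item, refusing characters that would make the
# spamd result line ambiguous to parse.
sub result_item {
    my ($name, $value) = @_;
    for my $part ($name, $value) {
        die "result item parts must not contain '=' or ','\n"
            if !defined $part || $part =~ /[=,]/;
    }
    return "$name=$value";
}

# A code reference of the kind set_spamd_result_item() expects; the
# name and value are made up for this example.
my $subref = sub { return result_item('queue', 'inbound') };
print $subref->(), "\n";

# In a plugin, with a real $status object in scope:
#   $status->set_spamd_result_item($subref);
```
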
    $status->finish ()
        Indicate that this $status object is finished with, and can be
        destroyed.

        If you are using SpamAssassin in a persistent environment, or
        checking many mail messages from one "Mail::SpamAssassin" factory,
        this method should be called to ensure Perl's garbage collection
        will clean up old status objects.

    $name = $status->get_current_eval_rule_name()
        Return the name of the currently-running eval rule. "undef" is
        returned if no eval rule is currently being run. Useful for
        plugins to determine the current rule name while inside an eval
        test function call.

    $status->get_decoded_body_text_array ()
        Returns the message body, with base64 or quoted-printable encodings
        decoded, and non-text parts or non-inline attachments stripped.

        This is the same result text as used in 'rawbody' rules.

        It is returned as an array of strings, with each string being a
        2-4kB chunk of the body, split at boundaries if possible.

    $status->get_decoded_stripped_body_text_array ()
        Returns the message body, decoded (as described in
        get_decoded_body_text_array()), with HTML rendered, and with
        whitespace normalized.

        This is the same result text as used in 'body' rules.

        It will always render text/html.

        It is returned as an array of strings, with each string
        representing one 'paragraph'. Paragraphs, in plain-text mails, are
        double-newline-separated blocks of multi-line text.

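The notion of a 'paragraph' in a plain-text mail can be sketched as follows. This is only an approximation of the idea (split at blank lines, collapse whitespace); it does no MIME decoding or HTML rendering, and the paragraphs() helper is a name chosen for this example.

```perl
use strict;
use warnings;

# Split plain text into paragraphs at blank lines and collapse runs of
# whitespace inside each paragraph into single spaces.
sub paragraphs {
    my ($text) = @_;
    my @out;
    for my $para (split /\n\s*\n/, $text) {
        $para =~ s/\s+/ /g;      # newlines and repeated blanks -> one space
        $para =~ s/^ | $//g;     # trim a leading or trailing space
        push @out, $para if length $para;
    }
    return @out;
}

my $body  = "Hello   there,\nthis is  one paragraph.\n\nAnd this\nis another.\n";
my @paras = paragraphs($body);
print scalar(@paras), " paragraphs\n";
print "[$_]\n" for @paras;
```
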
    $status->get (header_name [, default_value])
        Returns a message header, pseudo-header, real name or address.
        "header_name" is the name of a mail header, such as 'Subject',
        'To', etc. If "default_value" is given, it will be used if the
        requested "header_name" does not exist.

        Appending ":raw" to the header name will inhibit decoding of
        quoted-printable or base-64 encoded strings.

        Appending a modifier ":addr" to a header field name will cause
        everything except the first email address to be removed from the
        header field. It is mainly applicable to header fields 'From',
        'Sender', 'To', 'Cc' along with their 'Resent-*' counterparts, and
        the 'Return-Path'. For example, all of the following will result in
        "example@foo":

            example@foo
            example@foo (Foo Blah)
            example@foo, example@bar
            display: example@foo (Foo Blah), example@bar ;
            Foo Blah <example@foo>
            "Foo Blah" <example@foo>
            "'Foo Blah'" <example@foo>

        Appending a modifier ":name" to a header field name will cause
        everything except the first display name to be removed from the
        header field. It is mainly applicable to header fields containing a
        single mail address: 'From', 'Sender', along with their
        'Resent-From' and 'Resent-Sender' counterparts. For example, all
        of the following will result in "Foo Blah". One level of single
        quotes is stripped too, as it is often seen.

            example@foo (Foo Blah)
            example@foo (Foo Blah), example@bar
            display: example@foo (Foo Blah), example@bar ;
            Foo Blah <example@foo>
            "Foo Blah" <example@foo>
            "'Foo Blah'" <example@foo>

        There are several special pseudo-headers that can be specified:

        "ALL" can be used to mean the text of all the message's headers.
        Each header is decoded and unfolded to a single line, unless
        called with :raw.

        "ALL-TRUSTED" can be used to mean the text of all the message's
        headers that could only have been added by trusted relays.

        "ALL-INTERNAL" can be used to mean the text of all the message's
        headers that could only have been added by internal relays.

        "ALL-UNTRUSTED" can be used to mean the text of all the message's
        headers that may have been added by untrusted relays. To make this
        pseudo-header more useful for header rules the 'Received' header
        that was added by the last trusted relay is included, even though
        it can be trusted.

        "ALL-EXTERNAL" can be used to mean the text of all the message's
        headers that may have been added by external relays. Like
        "ALL-UNTRUSTED" the 'Received' header added by the last internal
        relay is included.

        "ToCc" can be used to mean the contents of both the 'To' and 'Cc'
        headers.

        "EnvelopeFrom" is the address used in the 'MAIL FROM:' phase of the
        SMTP transaction that delivered this message, if this data has been
        made available by the SMTP server.

        "MESSAGEID" is a symbol meaning all Message-Id's found in the
        message; some mailing list software moves the real 'Message-Id' to
        'Resent-Message-Id' or 'X-Message-Id', then uses its own one in the
        'Message-Id' header. The value returned for this symbol is the
        text from all 3 headers, separated by newlines.

        "X-Spam-Relays-Untrusted" is the generated metadata of untrusted
        relays the message has passed through.

        "X-Spam-Relays-Trusted" is the generated metadata of trusted relays
        the message has passed through.

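To make the ":addr" examples above concrete, here is a rough stand-alone approximation of "keep only the first address". It is a sketch written for this page, not the parser SpamAssassin uses, and first_addr() is a hypothetical helper; real header fields have many more corner cases than these two patterns cover.

```perl
use strict;
use warnings;

# Rough approximation of ":addr": return the first address in a field.
sub first_addr {
    my ($field) = @_;
    # "Display Name <addr>" form: take what is inside the angle brackets.
    return $1 if $field =~ /<([^<>\s]+\@[^<>\s]+)>/;
    # Otherwise drop (comments) and take the first bare addr-like token.
    (my $bare = $field) =~ s/\([^()]*\)//g;
    return $1 if $bare =~ /([^\s,;:"'<>]+\@[^\s,;:"'<>]+)/;
    return;
}

my @fields = (
    'example@foo (Foo Blah)',
    'display: example@foo (Foo Blah), example@bar ;',
    q{"'Foo Blah'" <example@foo>},
);
for my $field (@fields) {
    my $addr = first_addr($field);
    print defined $addr ? $addr : '(none)', "\n";
}
```

With a real status object the equivalent request would be written as $status->get('From:addr').
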
    $status->get_uri_list ()
        Returns an array of all unique URIs found in the message. It takes
        a combination of the URIs found in the rendered (decoded and HTML
        stripped) body and the URIs found when parsing the HTML in the
        message. Will also set $status->{uri_list} (the array as returned
        by this function).

        The returned array will include the "raw" URI as well as "slightly
        cooked" versions. For example, the single URI
        'http://%77w%77.example.com/' will get turned into:

            ( 'http://%77w%77.example.com/', 'http://www.example.com/' )

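The raw-plus-cooked pair in that example can be reproduced with plain percent-decoding. The sketch below covers only that one step, under the assumption that decoding every %XX escape is acceptable for illustration; SpamAssassin's own canonicalization does more than this, and uri_variants() is a name invented here.

```perl
use strict;
use warnings;

# Return the raw URI plus a percent-decoded variant when the two differ.
sub uri_variants {
    my ($raw) = @_;
    (my $decoded = $raw) =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $decoded eq $raw ? ($raw) : ($raw, $decoded);
}

print "$_\n" for uri_variants('http://%77w%77.example.com/');
```
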
    $status->get_uri_detail_list ()
        Returns a hash reference of all unique URIs found in the message
        and various data about where the URIs were found in the message.
        It takes a combination of the URIs found in the rendered (decoded
        and HTML stripped) body and the URIs found when parsing the HTML in
        the message. Will also set $status->{uri_detail_list} (the hash
        reference as returned by this function).

        The hash format looks something like this:

            raw_uri => {
              types => { a => 1, img => 1, parsed => 1, domainkeys => 1,
                         unlinked => 1, schemeless => 1 },
              cleaned => [ canonicalized_uri ],
              anchor_text => [ "click here", "no click here" ],
              domains => { domain1 => 1, domain2 => 1 },
              hosts => { host1 => domain1, host2 => domain2 },
            }

        "raw_uri" is whatever the URI was in the message itself
        (http://spamassassin.apache%2Eorg/). URIs parsed from text will be
        prefixed with a scheme if missing (http://, mailto: etc). HTML
        URIs are as found.

        "types" is a hash of the HTML tags (lowercase) which referenced the
        raw_uri. parsed is a faked type which specifies that the raw_uri
        was seen in the rendered text. domainkeys is defined when raw_uri
        was found from DK/DKIM d= field. unlinked is defined when it's
        assumed that the MUA will not linkify the URI (found in body
        without scheme or www. prefix). schemeless is always added for
        URIs without a scheme, regardless of linkifying (e.g. an email
        address found in body without mailto:).

        "cleaned" is an array of the raw and canonicalized version of the
        raw_uri (http://spamassassin.apache%2Eorg/,
        https://spamassassin.apache.org/).

        "anchor_text" is an array of the anchor text (text between <a> and
        </a>), if any, which linked to the URI.

        "domains" is a hash of the domains found in the canonicalized URIs.

        "hosts" is a hash of unstripped hostnames found in the
        canonicalized URIs as hash keys, with their domain part stored as a
        value of each hash entry.

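Walking this structure is ordinary nested-hash work. The sketch below uses a hand-built sample in the documented shape (every URI, host and domain in it is invented) and a hypothetical anchor_domains() helper that collects the domains of URIs referenced from <a> tags.

```perl
use strict;
use warnings;

# Sample data in the documented shape; all values are invented. With a
# real scan this would come from $status->get_uri_detail_list().
my $detail = {
    'http://shop.example.com/a' => {
        types       => { a => 1, parsed => 1 },
        cleaned     => [ 'http://shop.example.com/a' ],
        anchor_text => [ 'click here' ],
        domains     => { 'example.com' => 1 },
        hosts       => { 'shop.example.com' => 'example.com' },
    },
    'http://img.example.net/x.png' => {
        types   => { img => 1 },
        cleaned => [ 'http://img.example.net/x.png' ],
        domains => { 'example.net' => 1 },
        hosts   => { 'img.example.net' => 'example.net' },
    },
};

# Collect the domains of URIs that appeared in <a> tags.
sub anchor_domains {
    my ($uris) = @_;
    my %seen;
    for my $raw (keys %$uris) {
        my $info = $uris->{$raw};
        next unless $info->{types} && $info->{types}{a};
        $seen{$_} = 1 for keys %{ $info->{domains} || {} };
    }
    my @domains = sort keys %seen;
    return @domains;
}

print join(', ', anchor_domains($detail)), "\n";
```
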
    $status->add_uri_detail_list ($raw_uri, $types, $source, $valid_domain)
        Adds values to the internal uri_detail_list. When used from
        plugins, it is recommended to call this from parsed_metadata
        (along with register_method_priority, -10) so that other plugins
        calling get_uri_detail_list() will see it.

        "raw_uri" is the URI to be added. The only required parameter.

        "types" is an optional hash reference, contents are added to
        uri_detail_list->{types} (see get_uri_detail_list for known keys).
        parsed is the default if no hash is given. nocanon does not run
        uri_list_canonicalize (no redirector, uri fixing). noclean skips
        adding uri_detail_list->{cleaned}, so it would not be used in "uri"
        rule checks, but domain/hosts would still be used for URIBL/RBL
        purposes.

        "source" is an optional simple string, only used for debug logging
        purposes to identify where the URI originates from (default:
        "parsed").

        "valid_domain" is an optional boolean (0/1). If true, the URI will
        not be added unless the hostname/domain is in a valid format and
        contains a valid TLD. (default: 0)

    $status->clear_test_state()
        Clear test state, including test log messages from
        "$status->test_log()".

    $status->got_hit ($rulename, $desc_prepend [, name => value, ...])
        Register a hit against a rule in the ruleset.

        There are two mandatory arguments. These are $rulename, the name of
        the rule that fired, and $desc_prepend, which is a short string
        that will be prepended to the rule's "describe" string in output
        reports.

        In addition, callers can supplement that with the following
        optional data:

        score => $num
            Optional: the score to use for the rule hit. If unspecified,
            the value from the "Mail::SpamAssassin::Conf" object's
            "{scores}" hash will be used (a configured score), and in its
            absence the "defscore" option value.

        defscore => $num
            Optional: the score to use for the rule hit if neither the
            option "score" is provided, nor a configured score value is
            provided.

        value => $num
            Optional: the value to assign to the rule; the default value is
            1. tflags multiple rules use values of greater than 1 to
            indicate multiple hits. This value is accessible to meta
            rules.

        ruletype => $type
            Optional, but recommended: the rule type string. This is used
            in the "hit_rule" plugin call, called by this method. If
            unset, 'unknown' is used.

        tflags => $string
            Optional: a string, i.e. a space-separated list of additional
            tflags to be appended to an existing list of flags in
            $self->{conf}->{tflags}, such as: "nice noautolearn multiple".
            No syntax checks are performed.

        description => $string
            Optional: a custom rule description string. This is used in
            the "hit_rule" plugin call, called by this method. If unset,
            the static description is used.

        Backward compatibility: the two mandatory arguments have been part
        of this API since SpamAssassin 2.x. The optional name=>value
        pairs, however, are a new addition in SpamAssassin 3.2.0.

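The fallback order for the score (explicit "score", then configured score, then "defscore") can be written out as a small stand-alone sketch. resolve_score() and the rule names are hypothetical, and a plain hash stands in for the configured scores; what happens when none of the three is available is not stated above, so the sketch simply returns undef in that case.

```perl
use strict;
use warnings;

# Pick the score for a hit: explicit "score" option first, then a
# configured score, then the "defscore" option.
sub resolve_score {
    my ($rulename, $configured, %opts) = @_;
    return $opts{score}             if defined $opts{score};
    return $configured->{$rulename} if defined $configured->{$rulename};
    return $opts{defscore};         # undef if nothing was supplied
}

# Hypothetical configured scores.
my %scores = ( EXAMPLE_CONFIGURED => 2.5 );

print resolve_score('EXAMPLE_CONFIGURED', \%scores, defscore => 1), "\n";
print resolve_score('EXAMPLE_NEW_RULE',   \%scores, defscore => 1), "\n";
print resolve_score('EXAMPLE_CONFIGURED', \%scores, score => 0.1), "\n";
```

A corresponding call on a real status object would look like $status->got_hit('EXAMPLE_NEW_RULE', 'EXAMPLE: ', defscore => 1, ruletype => 'eval'), where the rule name and description prefix are again placeholders.
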
    $status->create_fulltext_tmpfile (fulltext_ref)
        This function creates a temporary file containing the passed scalar
        reference data (typically the full/pristine text of the message).
        This is typically used by external programs like pyzor and dccproc,
        to avoid hangs due to buffering issues. Methods that need this
        should call $self->create_fulltext_tmpfile($fulltext) to retrieve
        the temporary filename; it will be created if it has not already
        been.

        Note: This can only be called once until
        $status->delete_fulltext_tmpfile() is called.

    $status->delete_fulltext_tmpfile ()
        Will clean up after a $status->create_fulltext_tmpfile() call.
        Deletes the temporary file and uncaches the filename.

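The create, cache and delete pattern described by these two methods can be sketched with the core File::Temp module. This is an illustration of the pattern only, not SpamAssassin's code: create_tmpfile(), delete_tmpfile() and the file name template are names chosen for this example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my $cached_name;   # remembered between calls, like the cached filename

# Write the referenced text to a temporary file once and return its name;
# later calls return the cached name until delete_tmpfile() is called.
sub create_tmpfile {
    my ($text_ref) = @_;
    return $cached_name if defined $cached_name;
    my ($fh, $name) = tempfile('example.XXXXXX', TMPDIR => 1, UNLINK => 0);
    print {$fh} $$text_ref;
    close $fh or die "cannot close $name: $!";
    return $cached_name = $name;
}

# Remove the temporary file and forget its name.
sub delete_tmpfile {
    return unless defined $cached_name;
    unlink $cached_name or warn "cannot unlink $cached_name: $!";
    undef $cached_name;
    return;
}

my $message = "Subject: test\n\nbody\n";
my $file    = create_tmpfile(\$message);
print "wrote ", -s $file, " bytes to $file\n";
delete_tmpfile();
```

Handing an external program a file name rather than piping the message to it is what avoids the buffering hangs mentioned above.
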
    all_from_addrs_domains
        This function collects all the various From addresses in a
        message using all_from_addrs() and returns only their domain
        names.

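Reducing addresses to domains is a short transformation, sketched below on made-up addresses. The domains_of() helper is hypothetical, and its lowercasing and de-duplication are assumptions made for this example rather than documented behaviour of all_from_addrs_domains.

```perl
use strict;
use warnings;

# Reduce a list of addresses to their unique, lowercased domain parts.
sub domains_of {
    my @addrs = @_;
    my (%seen, @domains);
    for my $addr (@addrs) {
        next unless $addr =~ /\@([^\@\s]+)$/;   # text after the last "@"
        my $domain = lc $1;
        push @domains, $domain unless $seen{$domain}++;
    }
    return @domains;
}

# Hypothetical addresses, standing in for the result of all_from_addrs().
my @from = ('alice@Example.com', 'bob@example.com', 'carol@example.org');
print join(' ', domains_of(@from)), "\n";
```
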
SEE ALSO
    Mail::SpamAssassin(3) spamassassin(1)

perl v5.36.0  2022-07-23  Mail::SpamAssassin::PerMsgStatus(3)