DSPAM(1)                              DSPAM                             DSPAM(1)

NAME
       dspam - DSPAM Anti-Spam Agent

SYNOPSIS
       dspam [--mode=teft|toe|tum|notrain|unlearn] [--user user1 user2 ...
       userN] [--feature=noise|no,tb=N,whitelist|wh] [--class=spam|innocent]
       [--source=error|corpus|inoculation] [--profile=PROFILE]
       [--deliver=spam,innocent|nonspam,summary,stdout] [--help] [--version]
       [--process] [--classify] [--signature=signature] [--stdout] [--debug]
       [--daemon] [--nofork] [--client] [--rcpt-to recipient-address(es)]
       [--mail-from=sender-address] [passthru-delivery-arguments]

DESCRIPTION
       The DSPAM agent provides a direct interface to mail servers for
       command-line spam filtering. The agent can masquerade as the mail
       server's local delivery agent and will process any email passed to it.
       The agent will then call whatever delivery agent was specified at
       compile time or quarantine/tag/drop messages identified as spam. The
       DSPAM agent can function locally or as a proxy. It is also responsible
       for processing classification errors so that DSPAM can learn from its
       mistakes.

OPTIONS
       --user user1 user2 ... userN
              Specifies the destination users of the incoming message. In most
              cases this is the local user on the system; however, some
              implementations may call for virtual usernames, specific to
              DSPAM, to be assigned. The agent processes an incoming message
              once for each user specified. If the message is to be delivered,
              the $u (or %u) parameters of the argument string will be
              interpolated for the current user being processed.

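The per-user interpolation described for --user can be pictured with a short Python sketch. The function name expand_delivery_args and the sample arguments are hypothetical, chosen only for illustration; DSPAM performs the substitution internally and its exact rules may differ.

```python
def expand_delivery_args(args, users):
    """Return one expanded delivery argument list per user.

    Hypothetical helper: mimics the documented behaviour of replacing
    $u or %u in the passthru delivery arguments with the current user.
    """
    expanded = []
    for user in users:
        # Substitute both placeholder spellings in every argument.
        expanded.append(
            [arg.replace("$u", user).replace("%u", user) for arg in args]
        )
    return expanded
```

Calling expand_delivery_args(["-d", "%u"], ["alice", "bob"]) yields one argument list per user, which matches the statement that the message is processed once for each user.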
       --mode=teft|toe|tum|notrain|unlearn
              Configures the training mode to be used for this process,
              overriding any defaults in dspam.conf or the preference
              extension:

              teft : Train-Everything. Trains on all messages processed. This
              is a very thorough training approach and should be considered
              the standard training approach for most users. TEFT may,
              however, prove too volatile on installations with extremely high
              per-user traffic, or scale poorly on systems with extremely
              large user bases. If TEFT proves ineffective, one of the other
              modes is recommended.

              toe : Train-on-Error. Trains only on a classification error,
              once the user's metadata has matured to 2500 innocent messages.
              This training mode is much less resource intensive, as only
              occasional metadata writes are necessary. It is also far less
              volatile than the TEFT mode of training. One drawback, however,
              is that TOE only learns when DSPAM has made a mistake, which
              means the data is sometimes too static and unable to "ease into"
              a different type of behavior.

              tum : Train-until-Mature. This training mode is a hybrid of the
              other two training modes and balances volatility against static
              metadata. TuM trains, on a per-token basis, only those tokens
              that have had fewer than 25 "hits", unless an error is being
              retrained, in which case all tokens are trained. This training
              mode provides a solid core of stable tokens to keep accuracy
              consistent, but also allows for dynamic adaptation to any new
              types of email behavior a user might be experiencing.

              notrain : No training. Do not train the user's data, and do not
              keep totals. This should only be used in cases where you want to
              process mail for a particular user (based on a group, for
              example), but don't want the user to accumulate any learning
              data.

              unlearn : Unlearn original training. Use this if you wish to
              unlearn a previously learned message. Be sure to specify
              --source=error and set --class to the classification under which
              the message was originally learned. If TrainPristine is not in
              use, this requires the original signature from training.

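One way to read the mode descriptions is as a rule for choosing which tokens of a message get trained. The Python sketch below encodes that reading. The function name, the dictionary of hit counts, and the pre-maturity behaviour of TOE are assumptions made for illustration and are not taken from DSPAM's source.

```python
MATURITY_INNOCENTS = 2500  # TOE maturity threshold quoted in the text
TUM_HIT_LIMIT = 25         # TuM per-token hit threshold quoted in the text


def tokens_to_train(mode, token_hits, is_error, innocent_count=0):
    """Return the set of tokens that would be trained for one message.

    token_hits maps each token in the message to its current hit count.
    Illustrative reading of the mode descriptions only.
    """
    all_tokens = set(token_hits)
    if mode == "notrain":
        return set()
    if mode == "teft" or is_error:
        # TEFT trains everything; TOE and TuM train all tokens on an error.
        return all_tokens
    if mode == "toe":
        # Assumption: until the data has matured, TOE trains on everything.
        if innocent_count < MATURITY_INNOCENTS:
            return all_tokens
        return set()
    if mode == "tum":
        # Only tokens that are not yet "mature" keep being trained.
        return {tok for tok, hits in token_hits.items() if hits < TUM_HIT_LIMIT}
    raise ValueError(f"unknown training mode: {mode}")
```

With hit counts {"viagra": 3, "hello": 40}, TuM would train only "viagra" on an ordinary message but both tokens when an error is retrained.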
       --feature=noise|no,whitelist|wh,tb=N
              Specifies the features that should be activated for this filter
              instance. The following features may be used individually or
              combined using a comma as a delimiter:

              (no)ise : Bayesian Noise Reduction (BNR). Bayesian Noise
              Reduction kicks in at 2500 innocent messages and provides an
              advanced progressive noise logic to reduce Bayesian noise
              (wordlist attacks) in spam. See
              http://www.zdziarski.com/papers/bnr.html for more information.

              (tb)=N : Sets the training loop buffering level. Training loop
              buffering is the amount of statistical sedation performed to
              water down statistics and avoid false positives during the
              user's training loop. The training buffer sets the buffer
              sensitivity, and should be a number from 0 (no buffering
              whatsoever) to 10 (heavy buffering). The default is 5, half of
              what previous versions of DSPAM used. To avoid dulling down
              statistics at all during the training loop, set this to 0.

              (wh)itelist : Automatic whitelisting. DSPAM will keep track of
              the entire "From:" line for each message received per user, and
              automatically whitelist messages from senders with more than 20
              innocent messages and zero spams. Once the user reports a spam
              from the sender, automatic whitelisting is deactivated for that
              sender. Since DSPAM uses the entire "From:" line, and not just
              the sender's email address, automatic whitelisting is a very
              safe approach to improving accuracy, especially during initial
              training.

              NOTE : None of the present features are necessary when the
              source is "error", because the original training data is used
              from the signature to retrain, instantiating whatever features
              (such as whitelisting) were active at the time of the initial
              classification. Since BNR is only necessary when a message is
              being classified, the --feature flag can be safely omitted from
              error source calls.

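The feature syntax and the whitelisting threshold can be summarised in a small Python sketch. Both helper names are hypothetical; the first only assembles a --feature argument from the documented keywords, and the second restates the "more than 20 innocent messages and zero spams" rule without claiming to match DSPAM's internal bookkeeping.

```python
def feature_flag(noise=False, whitelist=False, tb=None):
    """Build a --feature argument from the documented feature names.

    Returns None when no feature is requested.
    """
    parts = []
    if noise:
        parts.append("noise")
    if whitelist:
        parts.append("whitelist")
    if tb is not None:
        # The text gives 0 (no buffering) to 10 (heavy buffering).
        if not 0 <= tb <= 10:
            raise ValueError("tb must be between 0 and 10")
        parts.append(f"tb={tb}")
    if not parts:
        return None
    return "--feature=" + ",".join(parts)


def auto_whitelisted(innocent_count, spam_count):
    """Restate the documented rule for one "From:" line."""
    return innocent_count > 20 and spam_count == 0
```

For example, feature_flag(noise=True, whitelist=True, tb=5) produces "--feature=noise,whitelist,tb=5".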
       --class=spam|innocent
              Identifies the disposition (if any) of the message being
              presented. This flag should be used when a misclassification has
              occurred, when the user is corpus-feeding a message, or when an
              inoculation is being presented. This flag should not be used for
              standard processing. This flag must be used in conjunction with
              the --source flag. Omitting this flag causes DSPAM to determine
              the disposition of the message on its own (the standard
              operating mode).

       --source=error|corpus|inoculation
              Where --class is used, the source of the classification must
              also be provided. The source tells dspam how to learn the
              message being presented:

              error : The message being presented was a message previously
              misclassified by DSPAM. When 'error' is provided as a source,
              DSPAM requires that the DSPAM signature be present in the
              message, and will use the signature to recall the original
              training metadata. If the signature is not present, the message
              will be rejected. In this source mode, DSPAM will also decrement
              each token's previous classification's count as well as the user
              totals.

              You should use error only when DSPAM has made an error in
              classifying the message, and should present the modified version
              of the message with the DSPAM signature when doing so.

              corpus : The message being presented is from a mail corpus, and
              should be trained as a new message, rather than re-trained based
              on a signature. The message's full headers and body will be
              analyzed and the correct classification will be incremented,
              without its opposite being decremented.

              You should use corpus only when feeding messages in from a
              corpus.

              inoculation : The message being presented is in pristine form,
              and should be trained as an inoculation. Inoculations are a more
              intense mode of training designed to cause DSPAM to train the
              user's metadata repeatedly on previously unknown tokens, in an
              attempt to vaccinate the user from future messages similar to
              the one being presented. You should use inoculation only on
              honeypots and the like.

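The difference between the error and corpus sources comes down to whether the previous classification is undone. The Python sketch below illustrates this on a pair of per-user totals. The function name and the dictionary layout are hypothetical, the increment on an error is inferred from the word "also" in the text, and inoculation is deliberately left unmodelled because the text does not give its arithmetic.

```python
def apply_training(totals, new_class, source):
    """Update per-user totals in place and return them.

    totals is a dict with "spam" and "innocent" counters.
    Illustrative only; DSPAM applies the same idea per token as well.
    """
    if new_class not in ("spam", "innocent"):
        raise ValueError(f"unknown class: {new_class}")
    opposite = "innocent" if new_class == "spam" else "spam"
    if source == "error":
        # Undo the earlier, wrong classification, then count the right one.
        totals[opposite] = max(0, totals[opposite] - 1)
        totals[new_class] += 1
    elif source == "corpus":
        # New message: increment only, nothing is decremented.
        totals[new_class] += 1
    else:
        raise ValueError(f"source not modelled in this sketch: {source}")
    return totals
```

Starting from 10 spam and 20 innocent messages, an error retrained as spam gives 11 and 19, while a corpus message trained as spam gives 11 and 20.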
       --profile=PROFILE
              Specify a storage profile from dspam.conf. The storage profile
              selected will be used for all database connectivity. See
              dspam.conf for more information.

       --deliver=spam,innocent|nonspam,summary,stdout
              Tells DSPAM to deliver the message if its result falls within
              the criteria specified. For example, --deliver=innocent or
              --deliver=nonspam will cause DSPAM to only deliver the message
              if its classification has been determined as innocent. Providing
              --deliver=innocent,spam or --deliver=nonspam,spam will cause
              DSPAM to deliver the message regardless of its classification.
              This flag provides a significant amount of flexibility for
              nonstandard implementations, such as one where false positives
              are not delivered but spam is.

              summary : Deliver (to stdout) a summary identical to the output
              of message classification:

              X-DSPAM-Result: User; result="Innocent"; class="Innocent";
              probability=0.0000; confidence=1.00;
              signature=4b11c532158749980119923

              stdout : A shortcut for --deliver=innocent,spam --stdout

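The delivery criteria can be expressed as a simple membership test. The following Python sketch is an assumption-laden illustration of that logic: the function name should_deliver is hypothetical, and the handling of the stdout shortcut simply follows the expansion given above.

```python
def should_deliver(result, deliver):
    """Decide whether a message with the given result would be delivered.

    result  -- "spam" or "innocent"
    deliver -- the value given to --deliver, e.g. "innocent,spam"
    """
    wanted = set()
    for item in deliver.split(","):
        item = item.strip().lower()
        if item == "nonspam":
            # Documented alias for innocent.
            wanted.add("innocent")
        elif item == "stdout":
            # Documented shortcut for innocent,spam (plus --stdout).
            wanted.update({"innocent", "spam"})
        elif item:
            wanted.add(item)
    return result.lower() in wanted
```

Under this reading, should_deliver("spam", "innocent") is false, while should_deliver("spam", "innocent,spam") is true.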
       --stdout
              If the message is indeed deemed "deliverable" by the --deliver
              flag, this flag will cause DSPAM to deliver the message to
              stdout, rather than the configured delivery agent.

       --process
              Tells DSPAM to process the message. This is the default
              behavior, and the flag is implied unless --classify is used.

       --classify
              Tells DSPAM to only classify the message, and not perform any
              writes to the user's data or attempt to deliver/quarantine the
              message. The results of a classification are printed to stdout
              in the following format:

              X-DSPAM-Result: User; result="Spam"; probability=1.0000;
              confidence=0.80

              NOTE : The output of the classification is specific to a user's
              own data, and does not include the output of any groups they
              might be affiliated with, so it is entirely possible that the
              message would be caught as spam by a group the user belongs to,
              and appear as innocent in the output of a classification. To get
              the classification for the group, use the group name as the
              user instead of an individual.

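Scripts that call dspam with --classify usually need the fields of the result line. The Python sketch below parses a line in the format shown above, assuming it has already been joined onto a single line. The function name parse_dspam_result is hypothetical, and the parser assumes the semicolon-separated key=value layout of the two examples in this page.

```python
def parse_dspam_result(line):
    """Parse an X-DSPAM-Result line into a dict.

    The first field after the header name is taken to be the user;
    the remaining fields are treated as key=value pairs.
    """
    name, _, rest = line.partition(":")
    if name.strip() != "X-DSPAM-Result":
        raise ValueError("not an X-DSPAM-Result line")
    fields = [f.strip() for f in rest.split(";") if f.strip()]
    if not fields:
        raise ValueError("empty X-DSPAM-Result line")
    parsed = {"user": fields[0]}
    for field in fields[1:]:
        key, _, value = field.partition("=")
        parsed[key.strip()] = value.strip().strip('"')
    # Convert the numeric fields when they are present.
    for key in ("probability", "confidence"):
        if key in parsed:
            parsed[key] = float(parsed[key])
    return parsed
```

Applied to the --classify example, the result carries "Spam" under the key "result" and the probability and confidence as floats.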
       --signature=signature
              If only the signature is available for training, and not the
              entire message, the --signature flag may be used to feed the
              signature into DSPAM and forego the reading of stdin. DSPAM will
              process the signature with whatever command-line classification
              was specified.

              NOTE : This should only be used with --source=error.

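A retraining call combines --user, --class, --source and, optionally, --signature. The Python sketch below only assembles such an argument list from the flags documented in this page and enforces the constraints stated in the text. The function name retrain_argv and the assumption that the binary is invoked as "dspam" are illustrative.

```python
def retrain_argv(user, message_class, signature=None, source="error"):
    """Assemble a dspam argument list for a training call.

    Only flags documented in this page are used; nothing is executed.
    """
    if message_class not in ("spam", "innocent"):
        raise ValueError(f"unknown class: {message_class}")
    if source not in ("error", "corpus", "inoculation"):
        raise ValueError(f"unknown source: {source}")
    if signature is not None and source != "error":
        # The text says --signature should only be used with --source=error.
        raise ValueError("--signature requires --source=error")
    argv = [
        "dspam",
        "--user", user,
        f"--class={message_class}",
        f"--source={source}",
    ]
    if signature is not None:
        argv.append(f"--signature={signature}")
    return argv
```

The resulting list could be handed to subprocess.run; when no signature is given, the message itself would be supplied on stdin.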
       --debug
              If DSPAM was compiled with --enable-debug then using --debug
              will turn on debugging messages.

       --daemon
              If DSPAM was compiled with --enable-daemon then using --daemon
              will cause DSPAM to enter daemon mode, where it will listen for
              DSPAM clients to connect and actively service requests.

       --nofork
              If DSPAM was compiled with --enable-daemon then using --nofork
              will cause DSPAM to not fork the daemon into the background when
              the --daemon switch is used.

       --client
              If DSPAM was compiled with --enable-daemon then using --client
              will cause DSPAM to act as a client and attempt to connect to
              the DSPAM server specified in the client's configuration within
              dspam.conf. If client behavior is desired, this option must be
              specified; otherwise the agent simply operates as a
              self-contained process and handles the message on its own,
              eliminating any benefit of using the daemon.

       --rcpt-to recipient-address(es)
              If DSPAM is configured to deliver via LMTP or SMTP, this flag
              may be used to define the RCPT TOs which will be used for the
              delivery of each user specified with --user. If no recipients
              are provided, the RCPT TOs will match the username.

              NOTE : The recipient list should always be balanced with the
              user list, or empty. Specifying an unbalanced number of
              recipients to users will result in undefined behavior.

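Because an unbalanced recipient list leads to undefined behavior, a calling script may want to check the lists before invoking dspam. The Python sketch below shows one such check. The function name pair_recipients is hypothetical, and pairing users with recipients by position is an assumption based on the word "balanced".

```python
def pair_recipients(users, recipients=None):
    """Pair each --user value with its RCPT TO address.

    An empty or missing recipient list falls back to the usernames,
    as the text describes. Positional pairing is an assumption.
    """
    if not recipients:
        return [(user, user) for user in users]
    if len(recipients) != len(users):
        raise ValueError(
            f"{len(users)} users but {len(recipients)} recipients"
        )
    return list(zip(users, recipients))
```

A mismatch such as two users and one recipient raises an error here instead of reaching dspam.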
       --mail-from=sender-address
              If DSPAM is configured to deliver via LMTP or SMTP, this flag
              will set the MAIL FROM sent on delivery of the message. The
              default MAIL FROM depends on how the message was originally
              relayed to DSPAM. If it was relayed via the command line, an
              empty MAIL FROM will be used. If it was relayed via LMTP, the
              original MAIL FROM will be used.

EXIT VALUE
       0      Operation was successful.
       other  Operation resulted in an error. If the error occurred while
              calling the delivery agent, the exit value of the delivery
              agent will be returned.

COPYRIGHT
       Copyright © 2002-2011 DSPAM Project
       All rights reserved.

       For more information, see http://dspam.sourceforge.net.

SEE ALSO
       dspam_admin(1), dspam_clean(1), dspam_crc(1), dspam_dump(1),
       dspam_logrotate(1), dspam_merge(1), dspam_stats(1), dspam_train(1)

DSPAM                            Aug 14, 2010                          DSPAM(1)