Spam filter headers

glocal · ‎11-09-2007

As reported by others, PN's spam filter seems to be a bit too aggressive with some messages from contacts known to me. I've been submitting these messages to notspam, but the training doesn't seem to work: similar messages from the same sources are again detected as spam. I am not sure I interpret the X-DSPAM headers correctly, but looking at the headers below I don't see what is so suspicious about such words as 'by PlusNet MXCore' or 'Exim 4.43', or the date. Any suggestions?
X-PN-Spam-Filtered: by PlusNet MXCore (v3.00)
X-DSPAM-Result: Spam
X-DSPAM-Processed: Sat Oct 6 17:23:10 2007
X-DSPAM-Confidence: 0.5635
X-DSPAM-Improbability: 1 in 130 chance of being ham
X-DSPAM-Probability: 1.0000
X-DSPAM-Factors: 15,
pote, 0.99000,
Received*(Exim+4.43), 0.99000,
Received*reg.co.uk+with, 0.99000,
Date*Sat+6, 0.99000,
Received*4.43), 0.99000,
<FONT+face="Century, 0.03238,
Received*smtp+(Exim, 0.04287,
Subject*disk, 0.06274,
Date*6, 0.08177,
Received*7E, 0.12324,
Received*reg.co.uk, 0.12453,
Delivery-date*23+09, 0.85764,
X-PN-Spam-Filtered*PlusNet+MXCore, 0.85036,
X-PN-Spam-Filtered*by, 0.85036,
X-PN-Spam-Filtered*MXCore, 0.85036
Michael

bobpullen · ‎04-04-2007

Quote from: glocal
X-PN-Spam-Filtered*PlusNet+MXCore, 0.85036,
X-PN-Spam-Filtered*by, 0.85036,
X-PN-Spam-Filtered*MXCore, 0.85036

I'll query this the next time I'm in the office, however I suspect that it's simply the nature of the spam training engine. You've got to bear in mind that every email used for training will have gone through these servers whether spam or ham.
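To illustrate the point, here is a minimal Python sketch of a simplified Graham-style per-token estimate. This is an assumption for illustration only and not necessarily the formula DSPAM uses; the function name and the message counts are hypothetical.

```python
def token_spam_probability(spam_hits, ham_hits, total_spam, total_ham):
    """Simplified Graham-style estimate (an assumption, not DSPAM's own formula).

    spam_hits / ham_hits: number of trained spam / ham messages containing the token.
    total_spam / total_ham: number of trained spam / ham messages overall.
    """
    spam_rate = spam_hits / total_spam if total_spam else 0.0
    ham_rate = ham_hits / total_ham if total_ham else 0.0
    if spam_rate + ham_rate == 0:
        return 0.5  # token never seen: treat as neutral (illustrative choice)
    p = spam_rate / (spam_rate + ham_rate)
    # Clamp so that no single token is ever treated as absolute proof.
    return min(0.99, max(0.01, p))


# A string present in every trained message, spam and ham alike, comes out neutral.
print(token_spam_probability(100, 900, 100, 900))

# A string seen once in trained spam and never in ham hits the ceiling.
print(token_spam_probability(1, 0, 100, 900))
```

Under this kind of estimate, a string that only ever appeared in mail trained as spam reaches the 0.99 ceiling after a single sighting, while a string that appears in all mail should sit near 0.5 unless the training counts are skewed.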

Bob Pullen
Plusnet Product Team

glocal · ‎11-09-2007

Well yes, but having checked headers from several messages I notice that under X-DSPAM-Factors there is always a list of what seem to be selected suspicious strings, each followed by a spam score. If this is true, I don't see why these three strings would be given such high scores, considering that 0.99 appears to be the highest score. These three strings come from a header that PN routinely inserts and that is found in all messages anyway. The high score given to the delivery date (which looks fine to me) is another mystery. Selecting Exim as a string is also puzzling. Obviously I am only guessing, but it all looks very strange, and these factors seem to increase the likelihood of false positives.
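For anyone wanting to compare factor lists across messages, here is a small Python sketch that splits the lines under X-DSPAM-Factors into string/score pairs. The function name is hypothetical, and reading `Header*string` as "string found in that header" is my assumption from the listings, not documented behaviour.

```python
def parse_dspam_factors(lines):
    """Turn pasted X-DSPAM-Factors lines into (header, string, score) tuples.

    Assumes one factor per line, as in the headers quoted in this thread.
    """
    factors = []
    for raw in lines:
        line = raw.strip().rstrip(",")
        # The score is whatever follows the last comma on the line.
        text, sep, score = line.rpartition(",")
        if not sep:
            continue  # e.g. the "X-DSPAM-Factors: 15" count line or a blank line
        try:
            value = float(score)
        except ValueError:
            continue  # not a factor line
        # Assumption: "Header*string" means the string came from that header.
        # A body string that itself contains "*" would be mis-split here.
        header, star, token = text.strip().partition("*")
        if not star:
            header, token = None, text.strip()
        factors.append((header, token, value))
    return factors


sample = """X-DSPAM-Factors: 15,
pote, 0.99000,
Received*(Exim+4.43), 0.99000,
Date*Sat+6, 0.99000,
Subject*disk, 0.06274,
X-PN-Spam-Filtered*by, 0.85036""".splitlines()

factors = parse_dspam_factors(sample)
# List the factors from most to least spam-like.
for header, token, value in sorted(factors, key=lambda f: f[2], reverse=True):
    print(f"{value:.5f}  {header or '(body)'}  {token}")
```

Running this over several messages would make it easier to see which strings keep turning up with high scores.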

bobpullen · ‎04-04-2007

glocal,
Understand entirely what you're saying and will ask one of our platform development guys what their take is on your observations.

Bob Pullen
Plusnet Product Team

mapletree · ‎28-07-2007

I have also noticed that the spam filter seems to be getting less accurate all the time. It has let through emails that are very obviously spam and I have also noticed an increase in false positives. It now puts emails from a DVD club I belong to in the spam folder.
See:
X-DSPAM-Result: Spam
X-DSPAM-Processed: Mon Oct 8 18:40:30 2007
X-DSPAM-Confidence: 0.4981
X-DSPAM-Improbability: 1 in 100 chance of being ham
X-DSPAM-Probability: 1.0000
X-DSPAM-Factors: 15,
Team+P, 0.00301,
Delivery-date*Mon+08, 0.00624,
Prioritise, 0.00864,
X-Mailer*Q2.21), 0.99000,
Received*08+Oct, 0.99000,
Received*500), 0.99000,
X-Mailer*B2.21+Q2.21), 0.99000,
Date*19+UT, 0.99000,
Received*0003ib, 0.99000,
Date*39+19, 0.99000,
Don't+hold, 0.99000,
high+priority, 0.01000,
a+masterpiece, 0.99000,
Remember+your, 0.01000,
Fill+up, 0.99000
Why are "Received" and "Date" indicators of spam?
