[Spambayes] Tough to classify
David Shaw
david at theresistance.net
Sat Apr 12 22:39:37 EDT 2003
I placed an order with Amazon today. I got a TiVo and a Java book.
The order confirmation came back unsure, with 123 clues pointing both
ways, and probability as follows:
*H* 0.981571864474
*S* 0.56420331545
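
For reference, here is a small sketch of how I understand those two numbers turn into the final score. It assumes the chi-squared combining scheme, where the overall score is (S - H + 1) / 2, and it assumes cutoffs of 0.20 for ham and 0.90 for spam. The function names and cutoff constants are my own, not taken from the classifier output.

```python
# Assumption: overall score = (S - H + 1) / 2, as in chi-squared combining.
# Assumption: ham/spam cutoffs of 0.20 and 0.90; adjust to your own settings.
H = 0.981571864474  # *H* value reported for the message
S = 0.56420331545   # *S* value reported for the message

HAM_CUTOFF = 0.20
SPAM_CUTOFF = 0.90


def combined_score(h, s):
    """Fold the ham and spam indicators into a single score in [0, 1]."""
    return (s - h + 1.0) / 2.0


def verdict(score, ham_cutoff=HAM_CUTOFF, spam_cutoff=SPAM_CUTOFF):
    """Map a score onto ham / unsure / spam using the two cutoffs."""
    if score < ham_cutoff:
        return "ham"
    if score >= spam_cutoff:
        return "spam"
    return "unsure"


score = combined_score(H, S)
print(round(score, 4), verdict(score))  # 0.2913 unsure
```

Under those assumptions the message lands in the unsure band because S is well above zero, even though H is close to 1.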
This message is obviously ham to a human, but here are some of the
higher spam clues:
find, 0.908163265306
day? 0.908163265306
$5,000. 0.908163265306
20, 0.908163265306
url:help 0.908163265306
telephone: 0.934782608696
order: 0.934782608696
buy 0.942237128563
saver 0.949438202247
seller 0.96511627907
online, 0.983271375465
ordering 0.987106017192
dollar 0.987106017192
grand 0.988431876607
shopping 0.992091388401
tax 0.994699646643
subject:with 0.99504950495
subject:Your 0.997366881217
What can be done in a case like this? I don't order from Amazon that
often (maybe 4 times a year), but Amazon itself is a ham clue:
url:amazon 0.155172413793
I feel like spambayes has enough clues to know this is ham; it's just a
question of calculating the probability in such a way as to recognize
it. I would be interested in any thoughts on this.
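
To make the question concrete, here is a minimal sketch of chi-squared combining as I understand it. It is not the actual spambayes code, and the clue probabilities in the example are made up rather than taken from my message. The names chi2q and combine are my own. The point is that strong clues in both directions push H and S both toward 1, which drags the score toward the middle.

```python
import math


def chi2q(x2, v):
    """Chance that a chi-squared variate with v degrees of freedom
    (v must be even) is at least x2."""
    m = x2 / 2.0
    total = term = math.exp(-m)
    for i in range(1, v // 2):
        term *= m / i
        total += term
    return min(total, 1.0)


def combine(probs):
    """Return (H, S, score) for a list of per-clue spam probabilities.

    Assumes every probability is strictly between 0 and 1.
    """
    n = len(probs)
    s = 1.0 - chi2q(-2.0 * sum(math.log(1.0 - p) for p in probs), 2 * n)
    h = 1.0 - chi2q(-2.0 * sum(math.log(p) for p in probs), 2 * n)
    return h, s, (s - h + 1.0) / 2.0


# Hypothetical clue sets, for illustration only.
all_spammy = [0.99] * 10
both_ways = [0.99] * 10 + [0.01] * 10

print(combine(all_spammy))  # H near 0, S near 1, score near 1
print(combine(both_ways))   # H and S both near 1, score near 0.5
```

With clues pointing both ways, neither side can win outright under this scheme, which looks like what happened to my order confirmation.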