[Spambayes] Latest spammer trick stymied

Richard Jowsey richard at jowsey.com
Tue Apr 1 09:43:08 EST 2003


> That's right.  We really should try to solve this problem with
> tokenization.

You're quite right. My initial response to these spams was to rip up 
the URL, then check every possible fragment (>= 3 chars) against 
the database. This proved reasonably effective when there were enough 
additional spam-words in the message to shift the classifier out of 
"unsure". 
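
For illustration only, here is a minimal sketch of that fragment check. It 
assumes "every possible fragment" means every substring of at least three 
characters within each alphanumeric run of the URL. The names url_fragments 
and known_fragments, and the plain dict standing in for the database, are 
hypothetical and are not Spambayes code.

```python
import re

MIN_LEN = 3  # the ">= 3 chars" threshold mentioned above


def url_fragments(url, min_len=MIN_LEN):
    """Return every distinct substring of length >= min_len taken from
    the alphanumeric runs of a URL (an assumed reading of "fragment")."""
    fragments = set()
    # Break the URL on anything that is not a letter or digit.
    for run in re.split(r"[^A-Za-z0-9]+", url.lower()):
        for start in range(len(run)):
            for end in range(start + min_len, len(run) + 1):
                fragments.add(run[start:end])
    return fragments


def known_fragments(url, database):
    """Return the fragments that are present in `database`, a hypothetical
    dict mapping token -> spam probability."""
    return {f: database[f] for f in url_fragments(url) if f in database}
```

The number of fragments grows quadratically with the length of each run, 
which is acceptable for URLs but worth keeping in mind for longer input.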

However, these new spams appear designed to provide us with the absolute 
minimum number of clues. Additional tokenization logic probably won't 
help much, but I'd be delighted if we could figure out a better way!

Cheers,
Richard


