[spambayes-dev] A spectacular false positive

Richie Hindle richie at entrian.com
Sat Nov 15 11:06:52 EST 2003


[Tim]
> There were about a half dozen strong ham
> clues that it had come from him, but about 140 spam clues from the variety
> of little integers, most hapaxes that had appeared in one training spam
> each.

Perhaps it's an argument for not classifying using hapaxes?  Wait for any
given clue to appear in more than one message before it becomes valid for
classification.  Has anyone tried this?  (And not just for SpamBayes -
Bill?)
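
To make the suggestion concrete, here is a minimal sketch of the idea. It
is not the SpamBayes classifier API: the ClueCounts class, its methods and
the min_messages parameter are all hypothetical names made up for
illustration. It only shows the filtering step, i.e. dropping any clue
that has appeared in fewer than two training messages before scoring.

```python
from collections import Counter


class ClueCounts:
    """Hypothetical per-token training counts (not the SpamBayes API)."""

    def __init__(self):
        self.spam = Counter()  # token -> number of spam messages containing it
        self.ham = Counter()   # token -> number of ham messages containing it

    def train(self, tokens, is_spam):
        # Count each token at most once per message.
        target = self.spam if is_spam else self.ham
        target.update(set(tokens))

    def usable_clues(self, tokens, min_messages=2):
        # Keep only tokens seen in at least min_messages training messages
        # (spam and ham combined); min_messages=2 excludes hapaxes.
        return sorted(
            t for t in set(tokens)
            if self.spam[t] + self.ham[t] >= min_messages
        )


counts = ClueCounts()
counts.train(["12", "47", "viagra"], is_spam=True)
counts.train(["viagra", "93"], is_spam=True)
counts.train(["tim", "python"], is_spam=False)
counts.train(["tim", "12"], is_spam=False)

message = ["tim", "12", "47", "93", "hello"]
print(counts.usable_clues(message))                  # hapaxes "47", "93" dropped
print(counts.usable_clues(message, min_messages=1))  # hapaxes kept
```

Whether the threshold should count spam and ham appearances together, as
here, or separately is an open design choice; the sketch picks the simpler
one.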

It could well have helped with the similar spectacular false positive that
I reported a few weeks ago - that was from a colleague as well, and
consisted of a list of US state codes and state names.  Many of those were
spam hapaxes.

-- 
Richie Hindle
richie at entrian.com



