[spambayes-dev] A spectacular false positive

Richie Hindle richie at entrian.com
Sat Nov 15 11:37:37 EST 2003


[Richie]
> Perhaps it's an argument for not classifying using hapaxes?  Wait for any
> given clue to appear in more than one message before it becomes valid for
> classification.  Has anyone tried this?  (And not just for SpamBayes -
> Bill?)

[Rob]
> Í hävè nöt tríéd ìt, büt Î äm qûìtë sürè ít wöûld pérfòrm wòrsë!

8-)

I'm sure it would perform worse in the short term, but as the size of the
training set increased, I think the performance would pretty much catch up
while the chance of false positives would remain significantly smaller.
(I speak with the conviction of someone with no evidence and negligible
mathematical ability...)
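
A minimal sketch of the "no hapaxes" idea, to make the proposal concrete: a
token only counts as a clue once it has been seen in at least two training
messages. This is not SpamBayes code. The function names, the min_messages
parameter and the toy training data are all hypothetical, made up for
illustration.

```python
from collections import Counter


def message_counts(messages):
    """For each token, count how many messages it appears in."""
    counts = Counter()
    for tokens in messages:
        # set() so a token repeated within one message is counted once
        counts.update(set(tokens))
    return counts


def usable_clues(tokens, spam_counts, ham_counts, min_messages=2):
    """Return the tokens seen in at least min_messages training messages.

    With min_messages=2, hapaxes (tokens seen in exactly one training
    message) are ignored when classifying.
    """
    return sorted(
        tok for tok in set(tokens)
        if spam_counts[tok] + ham_counts[tok] >= min_messages
    )


# Toy training data (hypothetical): each message is a list of tokens.
spam = [["cheap", "pills", "now"], ["cheap", "watches"]]
ham = [["meeting", "now"], ["lunch", "meeting", "entrian"]]

spam_counts = message_counts(spam)
ham_counts = message_counts(ham)

# "entrian" and "pills" each appear in only one training message, so
# they are dropped; "cheap" and "now" each appear in two, so they stay.
clues = usable_clues(["cheap", "entrian", "now", "pills"],
                     spam_counts, ham_counts)
print(clues)
```

Setting min_messages=1 gives back the usual behaviour where hapaxes are
used, which would make it easy to compare the two approaches on the same
training set.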

-- 
Richie Hindle
richie at entrian.com



