[Spambayes] How low can you go?

Tim Peters tim.one at comcast.net
Sat Dec 13 18:22:23 EST 2003


[Gerrit Holl]
> ...
> My father is using non-Bayesian SpamAssassin, and it seems the
> SpamAssassin manpage warns that without 'hundreds of messages' Bayesian
> spam filtering is unusable. This is obviously incorrect for SpamBayes.
> SpamBayes comes with no knowledge. Does it have a more intelligent
> algorithm? Or is the warning in the SpamAssassin manpage incorrect?

I'm afraid there are several research projects hiding in there, so we'll
probably never know.  For an individual, SpamBayes does much better than
chance after training on just 1 ham and 1 spam, and even that little can
*help* keep your inbox saner.  If a single SpamBayes were trying to filter
email for several people, though, it might do worse than chance after
training on 1 of each.  I don't know; I haven't tried, and "head arguments"
can be made in any direction here.  For example, if the single spam trained
on was addressed to me, and the single ham to you, then there's systematic
pressure to call everything addressed to me spam, and everything addressed
to you ham.
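
To make that last head argument concrete, here is a minimal sketch in
Python.  It is a toy, not the SpamBayes classifier: ToyFilter, the "to:me"
and "to:you" tokens, the strength and prior values, and the plain averaging
of token scores are all assumptions made up for illustration.  It trains on
exactly 1 spam addressed to me and 1 ham addressed to you, then scores one
new message for each of us.

```python
from collections import Counter


class ToyFilter:
    """Toy token-count filter for illustration; NOT the SpamBayes algorithm."""

    def __init__(self, strength=0.45, prior=0.5):
        self.strength = strength  # weight given to the prior (assumed value)
        self.prior = prior        # score of a never-seen token (assumed value)
        self.spam_counts = Counter()
        self.ham_counts = Counter()
        self.nspam = 0
        self.nham = 0

    def train(self, tokens, is_spam):
        # Count each distinct token once per message.
        counts = self.spam_counts if is_spam else self.ham_counts
        counts.update(set(tokens))
        if is_spam:
            self.nspam += 1
        else:
            self.nham += 1

    def token_score(self, token):
        # Fraction of trained spam / ham messages containing the token.
        spam_ratio = self.spam_counts[token] / self.nspam if self.nspam else 0.0
        ham_ratio = self.ham_counts[token] / self.nham if self.nham else 0.0
        if spam_ratio + ham_ratio == 0:
            return self.prior  # never seen: no evidence either way
        p = spam_ratio / (spam_ratio + ham_ratio)
        n = self.spam_counts[token] + self.ham_counts[token]
        # Pull the raw estimate toward the prior when evidence is thin.
        return (self.strength * self.prior + n * p) / (self.strength + n)

    def score(self, tokens):
        # Plain average of token scores; above 0.5 leans spam.
        scores = [self.token_score(t) for t in set(tokens)]
        return sum(scores) / len(scores) if scores else self.prior


shared = ToyFilter()
shared.train(["to:me", "cheap", "pills"], is_spam=True)      # the 1 spam
shared.train(["to:you", "meeting", "notes"], is_spam=False)  # the 1 ham

# New mail sharing nothing with the training data except the addressee.
ham_for_me = shared.score(["to:me", "lunch", "tomorrow"])
spam_for_you = shared.score(["to:you", "win", "money"])
print(ham_for_me, spam_for_you)
```

In this toy, the addressee token is the only evidence either new message
carries, so my ham scores above 0.5 and your spam scores below it: worse
than chance on both.  Whether the real classifier behaves the same way on a
shared mailbox is exactly the part nobody has tried.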



