[Spambayes] Question about the ratio of Spam to Ham you should train on...
Richie Hindle
richie at entrian.com
Sun Oct 3 14:18:35 CEST 2004
[Andrew]
> I know you're supposed to train Spambayes on a roughly equal
> amount of Spam and Ham. Does that mean you should try to train on one
> new Ham for every Spam you train, even if all your Ham is already being
> correctly identified by Spambayes?
> I get VASTLY more Spam than good mail, and in the last month of using
> Spambayes I've ended up training on over 200 spams, and only 33 hams.
[Graham]
> I'm in a similar position, and would be really interested in the
> opinions of the developers. I tend to train on my (already correctly
> classified) ham, just to try and keep the numbers even.
I personally try to keep the numbers even by training on
correctly-classified ham. The fact that a message is already correctly
classified doesn't make training on it useless - it's still worth doing.
There's been a lot written on the wiki about training strategies - start
at http://www.entrian.com/sbwiki/TrainingIdeas
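
To put rough numbers on "keeping the numbers even", here is a minimal
Python sketch of the arithmetic. It is not SpamBayes code and does not
call any SpamBayes API; the function name hams_needed_to_balance is made
up for illustration, and 200 stands in for Andrew's "over 200" spams.

```python
def hams_needed_to_balance(spam_trained: int, ham_trained: int) -> int:
    """Return how many more hams to train on to match the spam count.

    Hypothetical helper for illustration only; not part of SpamBayes.
    """
    if spam_trained < 0 or ham_trained < 0:
        raise ValueError("counts must be non-negative")
    # If ham already matches or exceeds spam, no extra ham is needed.
    return max(0, spam_trained - ham_trained)


# Andrew's figures: roughly 200 spams and 33 hams trained so far.
print(hams_needed_to_balance(200, 33))  # prints 167
```

With those figures, that means training on about 167 more
(already correctly classified) hams to even things out.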
--
Richie Hindle
richie at entrian.com