[Spambayes] Re: Training oddity/confusion

Tony Meyer tameyer at ihug.co.nz
Thu Jan 13 21:19:37 CET 2005


> >With 'classic' train to exhaustion, the database is kept exactly 
> >balanced, I believe.  How well is your system working for you?
> 
> Erm, not all that well. :|

:(  I'm trying to get things rearranged a little for 1.1 so that it's easier
to try out different training regimes (including tte) with the various apps,
so hopefully that'll help.

> My incoming mail is very unbalanced - 17:1 spam:ham since I 
> started the training - which can't help, but so far I have 
> 18% unsure spam and 3% false negatives. No mistakes on ham 
> though; none scored higher than 0.5%. Given that, I suppose I 
> could simply mess with the thresholds.

I've read reports of people who have done that (in an extreme way, so that
the cutoffs are 5% and 10% or something like that).  It seems pretty risky
to me, though, since a message that contains nothing that has been seen
before will score 0.5, and that would be the same under that system, so it
would land well above the spam cutoff...
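
To make the risk concrete, here's a rough sketch (not the actual SpamBayes
code) of a three-way split on cutoffs.  The classify() function, its
argument names, and the 0.20/0.90 defaults are assumptions for illustration
only; the 5%/10% values are the "extreme" cutoffs mentioned above.

```python
def classify(score, ham_cutoff=0.20, spam_cutoff=0.90):
    """Hypothetical helper: map a score in [0.0, 1.0] to a verdict.

    The default cutoffs are assumed values for illustration.
    """
    if score < ham_cutoff:
        return "ham"
    if score < spam_cutoff:
        return "unsure"
    return "spam"


# A message made up entirely of never-seen tokens scores 0.5.
UNKNOWN_SCORE = 0.5

# With the assumed default cutoffs, it falls in the unsure range.
default_result = classify(UNKNOWN_SCORE)

# With the extreme cutoffs (5% and 10%), the same score is past the
# spam cutoff.
aggressive_result = classify(UNKNOWN_SCORE, ham_cutoff=0.05, spam_cutoff=0.10)

print(default_result)
print(aggressive_result)
```

So with cutoffs that low, anything the classifier knows nothing about
would get treated as spam rather than unsure.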

=Tony.Meyer

-- 
Please always include the list (spambayes at python.org) in your replies
(reply-all), and please don't send me personal mail about SpamBayes.
http://www.massey.ac.nz/~tameyer/writing/reply_all.html explains this.


