[spambayes-dev] Another incremental training idea...
Seth Goodman
nobody at spamcop.net
Tue Jan 13 18:26:47 EST 2004
[Kenny Pitt]
> I've also been kicking around some auto-training ideas hoping for time
> to try them. One idea I had was based on a "sliding non-edge" scale.
> You would set a max imbalance, say 2:1, beyond which you would train on
> everything on the low side. As your imbalance falls back below the
> maximum, auto-train would start skipping the "edge" messages with near
> perfect classification scores. The closer you get to a perfect 1:1
> balance, the closer to the cutoff score the message would need to be
> before it would get auto-trained. Anyone see any obvious holes in this
> idea?
I don't see any obvious problems.
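
For concreteness, here is a rough Python sketch of the quoted "sliding non-edge" scheme as it might apply to the under-trained class. It is not SpamBayes code: the function names, the linear interpolation between 1:1 and the maximum imbalance, and the min_band floor (which keeps a sliver of messages trainable at an exact 1:1 balance) are all assumptions made for illustration. The cutoff and perfect scores are passed in rather than assumed.

```python
def eligible_band(n_low, n_high, max_imbalance=2.0, min_band=0.1):
    """Fraction of the cutoff-to-perfect score range that is trainable.

    n_low / n_high are the trained-message counts of the under- and
    over-trained class. 1.0 means "train on everything"; smaller values
    mean only messages near the cutoff are trained.
    Linear interpolation and min_band are illustrative assumptions.
    """
    if n_low <= 0:
        return 1.0                      # nothing trained yet: take everything
    ratio = n_high / n_low
    if ratio >= max_imbalance:
        return 1.0                      # at or beyond max imbalance: train on all
    frac = (ratio - 1.0) / (max_imbalance - 1.0)
    return max(min_band, frac)


def should_autotrain(score, cutoff, perfect, n_low, n_high,
                     max_imbalance=2.0, min_band=0.1):
    """Decide whether a message of the under-trained class gets trained.

    cutoff is the classification cutoff for that class and perfect is its
    ideal score (for example 1.0 for spam, 0.0 for ham).
    """
    band = eligible_band(n_low, n_high, max_imbalance, min_band)
    # 0.0 = sitting on the cutoff, 1.0 = perfect ("edge") score
    distance = abs(score - cutoff) / abs(perfect - cutoff)
    return distance <= band


if __name__ == "__main__":
    # Spam is the under-trained class here; 0.9 is an illustrative cutoff.
    for counts in [(100, 250), (100, 150), (100, 100)]:
        print(counts, should_autotrain(0.999, 0.9, 1.0, *counts))
```

With a 2:1 maximum, an imbalance of 1.5:1 leaves the half of the range nearest the cutoff trainable, so a score of 0.999 is skipped while 0.93 is still trained.
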
Another related idea is to dynamically move the edge thresholds until the
training ratio averages 1:1.
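
A minimal sketch of what moving the edge thresholds could look like, again in Python and again purely illustrative. It assumes a message is an "edge" message, and so skipped, when it scores below ham_edge or above spam_edge; move_edges, the fixed step size, and the default cutoffs are hypothetical choices, not part of SpamBayes.

```python
def move_edges(ham_edge, spam_edge, n_ham, n_spam,
               step=0.01, ham_cutoff=0.2, spam_cutoff=0.9):
    """Nudge both edge thresholds toward a 1:1 training ratio.

    Ham scoring below ham_edge and spam scoring above spam_edge are
    treated as edge messages and not trained. Shifting both thresholds
    down skips more spam and less ham; shifting both up does the reverse.
    The fixed step is an assumption; any feedback rule would do.
    """
    if n_spam > n_ham:
        shift = -step       # spam over-trained: skip more spam, less ham
    elif n_ham > n_spam:
        shift = step        # ham over-trained: skip more ham, less spam
    else:
        shift = 0.0         # balanced: leave the thresholds alone

    # Keep each threshold between its perfect score and its cutoff.
    ham_edge = min(ham_cutoff, max(0.0, ham_edge + shift))
    spam_edge = min(1.0, max(spam_cutoff, spam_edge + shift))
    return ham_edge, spam_edge


if __name__ == "__main__":
    ham_edge, spam_edge = 0.05, 0.98
    # Spam has been trained twice as often as ham, so both edges drift down.
    for _ in range(3):
        ham_edge, spam_edge = move_edges(ham_edge, spam_edge, 100, 200)
        print(round(ham_edge, 3), round(spam_edge, 3))
```

Called after each auto-trained message with the current counts, this acts as a simple feedback loop: whichever class is ahead gets more of its messages skipped until the counts even out.
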
--
Seth Goodman
Humans: off-list replies to sethg [at] GoodmanAssociates [dot] com
Spambots: disregard the above