[spambayes-dev] Another incremental training idea...
Eli Stevens (WG.c)
listsub at wickedgrey.com
Wed Jan 14 20:53:52 EST 2004
Tony Meyer wrote:
> Kenny Pitt wrote:
>>I've also been kicking around some auto-training ideas hoping for time
>>to try them. One idea I had was based on a "sliding non-edge" scale.
>>You would set a max imbalance, say 2:1, beyond which you would train
>>on everything on the low side.
>>As your imbalance falls back below the maximum, auto-train
>>would start skipping the "edge" messages with near perfect
>>classification scores. The closer you get to a perfect 1:1
>>balance, the closer to the cutoff score the message would
>>need to be before it would get auto-trained. Anyone see any
>>obvious holes in this idea?
>>
>
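For reference, here is a rough sketch of how I read Kenny's sliding scheme. Everything in it is an assumption made for illustration, not spambayes code: the names (edge_distance, sliding_should_train), the linear slide, the 0.1 cutoff distance, and especially the heavy-side rule beyond the maximum, which the description above leaves open.

```python
def edge_distance(score, is_spam):
    """Distance of a score from the ideal for its true class (0.0 = perfect)."""
    return 1.0 - score if is_spam else score


def sliding_should_train(score, is_spam, nham, nspam,
                         max_imbalance=2.0, cutoff_distance=0.1):
    """Decide whether to auto-train one message (hypothetical helper).

    cutoff_distance is how far from the ideal score the classification
    cutoff sits; 0.1 is only a placeholder value.
    """
    low, high = sorted((max(nham, 1), max(nspam, 1)))
    imbalance = high / low  # always >= 1.0
    on_low_side = (nspam <= nham) if is_spam else (nham <= nspam)
    distance = edge_distance(score, is_spam)

    if imbalance >= max_imbalance:
        if on_low_side:
            # Beyond the maximum: train on everything on the low side.
            return True
        # ASSUMPTION: the heavy side is only trained at or past the cutoff.
        return distance >= cutoff_distance

    # Below the maximum the threshold slides linearly (an assumption):
    # 0 at the maximum imbalance, cutoff_distance at a perfect 1:1 balance.
    closeness = (max_imbalance - imbalance) / (max_imbalance - 1.0)
    return distance >= cutoff_distance * closeness
```

With these placeholder numbers, a spam scoring 0.97 is skipped at a 1.5:1 balance but trained at 1.9:1, because the threshold shrinks as the imbalance approaches the maximum.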
> I tried almost this with the incremental regime, using a maximum of 2::1 or
> 1::2. It did pretty consistently worse than the basic nonedge regime. The
> only difference is that I didn't choose which messages to use if an
> imbalance would be created. The idea was basically to do nonedge, except if
> there was an imbalance, and then only train messages that move the balance
> closer to 1::1.
It sounds like you are saying that non-edge messages on the heavy side
were not trained. It seems that would be a key difference. Was that
the case in your test?
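To make the question concrete, here is a minimal sketch of the rule as I understand it from your description. It is only my reading, not your test code: the function name, the 0.05 edge margin, and the choice to check the ratio after the message would be added are all assumptions.

```python
def capped_nonedge_should_train(score, is_spam, nham, nspam,
                                max_ratio=2.0, edge_margin=0.05):
    """Nonedge training with a cap on the ham:spam ratio (hypothetical)."""
    # Basic nonedge regime: skip messages scored close to perfectly.
    distance = 1.0 - score if is_spam else score
    if distance <= edge_margin:
        return False

    # Counts as they would be if this message were trained.
    new_ham = nham + (0 if is_spam else 1)
    new_spam = nspam + (1 if is_spam else 0)
    low, high = sorted((new_ham, new_spam))

    if high > max_ratio * max(low, 1):
        # Beyond 2::1 or 1::2: train only if the message is on the low
        # side, so that it moves the balance closer to 1::1.
        return (new_spam < new_ham) if is_spam else (new_ham < new_spam)
    return True
```

Under this reading, with 100 ham and 200 spam already trained, a non-edge spam scoring 0.80 is skipped while a non-edge ham scoring 0.20 is trained. That skipped heavy-side message is the case I am asking about.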
Eli