[spambayes-dev] Another incremental training idea...

Barry Warsaw barry at python.org
Tue Jan 13 18:33:43 EST 2004


On Tue, 2004-01-13 at 18:21, Skip Montanaro wrote:

> This of course jibes pretty well with many people's observation (and my
> experience) that most unsures are actually spam.  I think I need to adjust
> some thresholds to try to reduce the number of spams which get trained on.

I'm still training on errors and have had very good results, with an
occasional reset of my spam train folder.  I see everything, including
mailing list traffic and admin notifications.  I just started to train
on admin notices that contain attached spam (i.e. auto-discards and hold
messages), so now my unsures have started to go up, as have false
negatives.  It's starting to stabilize, though, because I often see the
held messages as pure spam too, so as I train on those, I'm guessing the
differences between the wrapped and unwrapped spam are becoming more
evident.  In any case, the fp rate is extremely low -- I haven't seen one
in the several weeks since I blew away my database and retrained.

All in all, train-on-error /seems/ good enough for me.  It does the two
things I really want it to do: it moves almost all my spam to a separate
folder which I can check much less often, and it gives me no false
positives.
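
For anyone unfamiliar with the term, train-on-error means a message is
only added to the training data when the filter got it wrong or scored
it as unsure.  A minimal sketch of that loop follows.  ToyClassifier,
its scoring rule, and the two cutoff values are assumptions made up for
illustration; none of this is the SpamBayes API or its actual math.

```python
from collections import Counter

# Assumed thresholds for this sketch, not taken from the message above.
HAM_CUTOFF = 0.20   # scores below this are treated as ham
SPAM_CUTOFF = 0.90  # scores at or above this are treated as spam


class ToyClassifier:
    """Hypothetical stand-in for a real Bayesian filter."""

    def __init__(self):
        self.spam = Counter()  # token -> number of spam messages seen
        self.ham = Counter()   # token -> number of ham messages seen

    def train(self, text, is_spam):
        counts = self.spam if is_spam else self.ham
        counts.update(set(text.lower().split()))

    def score(self, text):
        """Average per-token spam ratio; 0.5 when no token is known."""
        ratios = []
        for tok in set(text.lower().split()):
            s, h = self.spam[tok], self.ham[tok]
            if s + h:
                ratios.append(s / (s + h))
        return sum(ratios) / len(ratios) if ratios else 0.5


def label(score):
    """Map a score to 'spam', 'ham', or 'unsure' using the cutoffs."""
    if score >= SPAM_CUTOFF:
        return "spam"
    if score < HAM_CUTOFF:
        return "ham"
    return "unsure"


def train_on_error(clf, text, is_spam):
    """Train only when the verdict was wrong or unsure.

    Returns True if the message was trained on, False if it was skipped.
    """
    truth = "spam" if is_spam else "ham"
    if label(clf.score(text)) == truth:
        return False  # filter already got it right; leave the database alone
    clf.train(text, is_spam)
    return True
```

With an empty database every message scores as unsure, so the first
copy of a message gets trained on; a second identical copy is then
classified correctly and skipped, which is what keeps the training set
small under this scheme.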

-Barry




