[Spambayes] RE: Spambayes Digest, Vol 73, Issue 33

Kenny Pitt kennypitt at hotmail.com
Fri Sep 24 16:56:38 CEST 2004


The POP3 Proxy version of SpamBayes (and maybe some others) supports an
option to train on every message that SpamBayes classifies, without any user
intervention. If SpamBayes thinks a message is good, it trains it as good;
if it thinks a message is spam, it trains it as spam. That's what we mean by
training automatically.
 
The Outlook Add-in has never supported this. The user has to explicitly
choose to train a message either by dragging it into or out of the spam
folder, or by clicking the "Delete As Spam" or "Recover From Spam" buttons.
There are some options in SpamBayes Manager that control the behavior of
this manual training, but nothing to enable any sort of automatic training.
I would be interested in knowing which options you thought meant automatic
training.
 
Training automatically on everything that SpamBayes classifies is what we
refer to as the "Train On Everything" strategy. You can read more about it
here on the Wiki:
 
http://entrian.com/sbwiki/TrainOnEverything
 
The reason that the Outlook Add-in does not support this is that it can be
very dangerous for a casual user. First, if you receive a lot of e-mail then
your training database size will get very large very quickly. Also, if you
receive a lot more spam than ham (or vice versa) then your training will get
out of balance very quickly.
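
To put rough numbers on the growth and balance problem, here is a toy
sketch. It is not SpamBayes code and does not use its API; the function name
and the 900/100 mailbox are made up purely for illustration.

```python
# Toy illustration only; this is not SpamBayes code or its API.
def train_on_everything(labels):
    """Count trained messages per class when every classified message is trained."""
    counts = {"ham": 0, "spam": 0}
    for label in labels:
        counts[label] += 1
    return counts

# Hypothetical mailbox: 900 spam and 100 ham arriving over some period.
incoming = ["spam"] * 900 + ["ham"] * 100
counts = train_on_everything(incoming)
print(counts)                          # {'ham': 100, 'spam': 900}
print(counts["spam"] / counts["ham"])  # 9.0
```

Every one of the 1000 messages ends up in the database, and the trained
spam outnumbers the trained ham nine to one, mirroring whatever mix of mail
happens to arrive.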
 
Finally, and probably most important, it requires a lot of diligence in
checking for mistakes. If you only train manually on mistakes and unsures,
then the worst that happens if you fail to notice a mistake that SpamBayes
made is that the message gets discarded and never heard from again. On the
other hand, if you are automatically training on everything and SpamBayes
makes a mistake, then that message has already been added to your training
under the wrong classification. If you fail to notice that mistake, then the
incorrect clues will remain in your database and will negatively affect the
classification of any similar messages you receive in the future.
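
The difference can be sketched with a small toy model. This is only an
illustration under simplifying assumptions: the names (process,
wrong_entries, the message tuples) are hypothetical and not part of
SpamBayes, and manual training is reduced to "train only on mistakes the
user notices", leaving unsures out.

```python
# Toy illustration only; names are hypothetical and not part of SpamBayes.
def process(messages, train_automatically):
    """Return the training database as a list of (text, trained_label) pairs.

    Each message is (text, true_label, guessed_label, user_noticed).
    """
    database = []
    for text, true_label, guessed_label, user_noticed in messages:
        if train_automatically:
            # The classifier's guess goes straight into the database.
            database.append((text, guessed_label))
            if guessed_label != true_label and user_noticed:
                # The user corrects the mistake: retrain under the true label.
                database[-1] = (text, true_label)
        elif guessed_label != true_label and user_noticed:
            # Manual training: only noticed mistakes are ever trained.
            database.append((text, true_label))
    return database

messages = [
    ("meeting notes", "ham", "ham", False),
    ("invoice from client", "ham", "spam", False),  # unnoticed false positive
    ("cheap pills", "spam", "ham", True),           # noticed and corrected
]

def wrong_entries(database):
    """List database entries whose trained label differs from the true label."""
    truth = {text: label for text, label, _, _ in messages}
    return [(text, label) for text, label in database if truth[text] != label]

print(wrong_entries(process(messages, train_automatically=True)))
# [('invoice from client', 'spam')]
print(wrong_entries(process(messages, train_automatically=False)))
# []
```

In both runs the unnoticed invoice message is lost to the spam folder, but
only the automatic run also leaves a wrongly labelled entry behind in the
training data.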
 
-- 
Kenny Pitt
 


  _____  

From: Windhorn, Allen, E. [LS/MKT] [mailto:Allen.Windhorn at LSUSA.com] 
Sent: Thursday, September 23, 2004 5:38 PM
To: 'spambayes at python.org'
Cc: 'kennypitt at hotmail.com'
Subject: RE: Spambayes Digest, Vol 73, Issue 33



Kenny & group, 

Message: 6 
Date: Thu, 23 Sep 2004 17:00:04 -0400 
From: "Kenny Pitt" <kennypitt at hotmail.com> 

> ...  Note that if you do not explicitly train on some 
> messages then the database will not update.  SpamBayes 
> never trains automatically in the Outlook Add-in version. 

What?!  Is this true?  If so, why not?  Why does it have check boxes to
enable this then? 
