[Spambayes] question regarding training

Michael Kimball michael at kimballpottery.com
Wed Aug 11 16:00:45 CEST 2004


Missy wrote:
> 
> I mistakenly sent this directly to Tony...my question is, how do I get it to
> "train on mistakes"?

In Configuration | Advanced Configuration | Interface Options set both
'Default training for ham' and 'Default training for spam' to 'discard',
and 'Default training for unsure' to 'defer'.  Then when you 'Review
Messages', click the appropriate 'ham' or 'spam' radio button for each
of the 'unsure' e-mails; in the 'ham' and 'spam' categories, click a
radio button only for those that were incorrectly classified.  When
done with those changes, click the 'Train' button.  You'll see the
message 'Done. Trained on # messages', where # is the number of
messages that weren't left at 'discard'.
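
If it helps to see the counting spelled out, here is a rough Python
sketch of the idea.  It is only a toy model of the review page, not
SpamBayes code: the names DEFAULT_CHOICE and messages_to_train are made
up for illustration, and it assumes that anything left at 'discard' or
'defer' is simply not trained on.

```python
# Toy model of "train on mistakes"; none of these names come from SpamBayes.

# Assumed defaults, matching the settings described above.
DEFAULT_CHOICE = {"ham": "discard", "spam": "discard", "unsure": "defer"}


def messages_to_train(reviewed):
    """Return the (classified_as, label) pairs that would be trained.

    `reviewed` is a list of (classified_as, user_choice) tuples, where
    user_choice is 'ham', 'spam', or None if the radio button was left
    at its default.
    """
    to_train = []
    for classified_as, user_choice in reviewed:
        choice = user_choice or DEFAULT_CHOICE[classified_as]
        # Only an explicit 'ham' or 'spam' choice leads to training.
        if choice in ("ham", "spam"):
            to_train.append((classified_as, choice))
    return to_train


reviewed = [
    ("ham", None),       # correctly classified, left at 'discard'
    ("spam", None),      # correctly classified, left at 'discard'
    ("spam", "ham"),     # false positive, corrected to ham
    ("ham", "spam"),     # false negative, corrected to spam
    ("unsure", "spam"),  # unsure, decided by hand
    ("unsure", "ham"),   # unsure, decided by hand
]

trained = messages_to_train(reviewed)
print("Done. Trained on %d messages" % len(trained))
```

With the six example messages above, only the two mistakes and the two
unsures are trained on, so the sketch reports 4 messages.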

> 
> -----Original Message-----
> From: Tony Meyer [mailto:tameyer at ihug.co.nz]
> Sent: Tuesday, August 10, 2004 1:32 AM
> To: 'Missy'; spambayes at python.org
> Subject: RE: [Spambayes] question regarding training
> 
> > I have noticed that on my Spambayes manager, it has way more spam than
> > ham.  It also states that it works best when there are equal amounts
> > of both.
> > What can I do to make it work more efficiently?
> 
> This is getting to be a FAQ!
> 
> Firstly, if you are not already, then doing "train on mistakes" is a good
> idea.  Basically, the only training you do is on mail that ends up in the
> 'unsure' folder, and any false positives (good mail in spam folder) and
> false negatives (vice versa), if there are any.  This should reduce the
> imbalance, and make it grow less quickly.
> 
> If you get a lot of mail in the 'unsure' folder, you can adjust the
> thresholds (Filtering tab), to try and reduce it.
> 
> If you get multiple copies of a spam message, don't "Delete as spam" all of
> them, just one, and move the rest to the spam folder (or Deleted Items)
> manually.
> 
> Don't worry too much about the imbalance as long as things are working well
> enough.  Particularly if it's a small imbalance (like 3:1) rather than a
> large one (like 100:1).
> 
> (Longer term, the developers are trying to figure out ways to help people
> with this problem, but that's a way off yet).
> 
> =Tony Meyer
> 
> ---
> Please always include the list (spambayes at python.org) in your replies
> (reply-all), and please don't send me personal mail about SpamBayes. This
> way, you get everyone's help, and avoid a lack of replies when I'm busy.
> 
> _______________________________________________
> Spambayes at python.org
> http://mail.python.org/mailman/listinfo/spambayes
> Check the FAQ before asking: http://spambayes.sf.net/faq.html
> 
> --
> Incoming mail is certified Virus Free.
> Checked by AVG anti-virus system (http://www.grisoft.com).
> Version: 6.0.736 / Virus Database: 490 - Release Date: 8/9/2004




