[Spambayes] More ham than spam?

Missy missyhmakr at hotmail.com
Tue Aug 31 15:17:41 CEST 2004


Can you tell me how to do this?  I read the article mentioned, but I am
still not sure how to go about it.

Missy 

-----Original Message-----
From: spambayes-bounces at python.org [mailto:spambayes-bounces at python.org] On
Behalf Of Ferino Mardo
Sent: Tuesday, August 31, 2004 6:05 AM
To: Kenny Pitt; Ferino Mardo; spambayes at python.org
Subject: RE: [Spambayes] More ham than spam?

Replies below:

> -----Original Message-----
> From: Kenny Pitt [mailto:kennypitt at hotmail.com]
> Sent: Monday, August 30, 2004 07:40 PM
> To: Ferino Mardo; spambayes at python.org
> Subject: RE: [Spambayes] More ham than spam?
> 
> 
> Ferino Mardo wrote:
> > The SPAMbayes manager complains that I have much more ham than spam.
> > What should one do? Delete his good emails to make things even?
> 
> We hear this question a lot, but most people find that they have too
> much *spam* and not enough ham.  Ham messages typically have a more 
> consistent set of senders, receivers, and topics, and therefore 
> usually require less training to identify correctly than spam 
> messages.
> 
> Did you have SpamBayes train itself on some of your existing messages 
> when you first configured it?  If so, you probably had a lot more ham 
> messages in your initial training set.
> 

Yes, I did. I have lots of emails I consider good and only a few spam.
I was just curious whether the message means anything beyond the obvious.

> If you are getting acceptable accuracy from SpamBayes then don't worry 
> too much about the warning.  It's only a guideline, and how much 
> effect the imbalance has will depend on how severe the imbalance is as 
> well as on your specific mixture of e-mails.
> 

I'm getting more than acceptable accuracy from SpamBayes. I like the
product!

> On the other hand, if your accuracy is poor then I would recommend 
> deleting your training data and retraining SpamBayes from scratch with 
> no initial training data.  Instead, just train manually on any Unsure
> messages as well as messages that SpamBayes identifies incorrectly (ham
> classified as spam or vice versa).  We usually refer to this training
> strategy as "Train on Errors and Unsures", and you can read more about
> it on the SpamBayes wiki:
> 
http://entrian.com/sbwiki/TrainOnErrorsAndUnsures

You can also get more information about alternative training strategies
here:

http://entrian.com/sbwiki/TrainingIdeas

--
Kenny Pitt
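
For anyone unsure what "Train on Errors and Unsures" amounts to in
practice, here is a rough sketch in Python. It is not SpamBayes code and
does not use the SpamBayes API: the cutoff values, the function names, and
the (id, score, label) tuples are all assumptions made up for illustration.
The idea is simply that a message is worth training on only when the
filter's verdict is "unsure" or disagrees with what you know it to be.

```python
# Illustrative cutoffs (assumed here); the real filter has its own settings.
HAM_CUTOFF = 0.20
SPAM_CUTOFF = 0.90


def classify(score):
    """Map a spam probability in [0, 1] to 'ham', 'unsure', or 'spam'."""
    if score < HAM_CUTOFF:
        return "ham"
    if score >= SPAM_CUTOFF:
        return "spam"
    return "unsure"


def should_train(score, true_label):
    """Return True for unsures and for mistakes, False for correct verdicts.

    true_label is 'ham' or 'spam', so an 'unsure' verdict never matches it
    and is always selected for training.
    """
    return classify(score) != true_label


def select_for_training(scored_messages):
    """Pick message ids to train on from (id, score, true_label) tuples."""
    return [msg_id for msg_id, score, label in scored_messages
            if should_train(score, label)]


# Hypothetical inbox: (message id, filter score, what the message really is)
inbox = [
    ("a", 0.01, "ham"),   # correct, skip
    ("b", 0.55, "ham"),   # unsure, train
    ("c", 0.95, "spam"),  # correct, skip
    ("d", 0.97, "ham"),   # ham classified as spam, train
    ("e", 0.10, "spam"),  # spam classified as ham, train
]
to_train = select_for_training(inbox)
```

With this made-up inbox only "b", "d", and "e" would be trained on; the
correctly classified messages are left alone, which keeps the training
set small.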

_______________________________________________
Spambayes at python.org
http://mail.python.org/mailman/listinfo/spambayes
Check the FAQ before asking: http://spambayes.sf.net/faq.html

