[Spambayes] Problem with POP3 Proxy

Tony Meyer tameyer at ihug.co.nz
Mon Feb 16 19:51:34 EST 2004


> Thanks for the quick response!  Look forward to seeing 
> 1.0a10, or 1.0aPI^e, or whatever the next one is.

:)  It should be 1.0b1, but it's looking like maybe it'll be 1.0a10.

> In related news - despite numbers like these:  [Total emails 
> trained: Spam: 1549  Ham: 888], I've been pretty consistently 
> getting 5-10 "unsures" a day... and recently, a couple of 
> false negatives, too.  (Though only one or two false 
> positives over the life of the installation, which is great.) 
>  I wonder if this is due to smarter / more pathological spam 
> in recent days.

The only real way to figure out why messages score the way they do is to
look at the clues for each message.  If you can't work out why one got the
score it did, then feel free to post an example set of clues here and we can
try to figure it out with you.

5-10 "unsures" per day could be ok, BTW, depending on how much mail you get.
Most of the testing, IIRC, tends to result in 2-5% unsure, so if you're
getting 250-500 messages per day (easy enough with a few high volume lists),
then this is a pretty reasonable result.  Are the false negatives a couple
per day, or a couple every now and then?  These should be much less common,
although they're hard to avoid if they're quite different from anything
you've trained on before.
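
As a rough check on that arithmetic, here is a small sketch.  The function
names are made up for illustration (they are not part of SpamBayes), and the
2-5% range is just the figure quoted above, not a guarantee:

```python
# Back-of-the-envelope arithmetic for "unsure" counts.
# Hypothetical helpers for illustration only; not SpamBayes code.

def expected_unsures(messages_per_day, unsure_percent):
    """Expected number of unsures per day at a given unsure percentage."""
    return messages_per_day * unsure_percent / 100


def implied_volume(unsures_per_day, unsure_percent):
    """Daily mail volume that would produce this many unsures."""
    return unsures_per_day * 100 / unsure_percent


# 5-10 unsures per day at a 2% unsure rate implies 250-500 messages per day;
# at a 5% rate the same counts imply only 100-200 messages per day.
low_rate_range = (implied_volume(5, 2), implied_volume(10, 2))
high_rate_range = (implied_volume(5, 5), implied_volume(10, 5))
```

So 5-10 unsures per day sits at the low (2%) end of the range for someone
getting 250-500 messages, and at the high (5%) end for 100-200 messages.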

One thing that can make a difference is the training 'regime' that you use.
The wiki (http://entrian.com/sbwiki) has lots of details, but the three most
common are 'train on everything', 'mistake based training' (train on
unsures, false positives, and false negatives) and 'nonedge training' (train
on anything within given edges, say 0.05 and 0.95).  The latter two are
usually the most successful, with nonedge probably slightly in the lead.
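
To make the difference between the three regimes concrete, here is a
minimal sketch of the "should I train on this message?" decision.  This is
not the SpamBayes API: the function names, the 0.20/0.90 cutoffs, and the
literal reading of 'nonedge' (train only when the score falls between the
edges) are all assumptions for illustration.  The wiki has the real
details.

```python
# Sketch of the three training regimes described above.
# All names and cutoff values here are assumptions, not SpamBayes code.

HAM_CUTOFF = 0.20   # assumed: scores below this are treated as ham
SPAM_CUTOFF = 0.90  # assumed: scores at or above this are treated as spam


def classify(score):
    """Map a score in [0, 1] to 'ham', 'spam', or 'unsure'."""
    if score < HAM_CUTOFF:
        return "ham"
    if score >= SPAM_CUTOFF:
        return "spam"
    return "unsure"


def should_train(regime, score, actual, low_edge=0.05, high_edge=0.95):
    """Decide whether to train on a message under the given regime.

    'actual' is what the message really is: 'ham' or 'spam'.
    """
    if regime == "everything":
        # Train on every message, regardless of score.
        return True
    if regime == "mistake":
        # Unsures, false positives, and false negatives all differ
        # from the true label, so they all get trained.
        return classify(score) != actual
    if regime == "nonedge":
        # Train on anything scoring within the edges (inclusive).
        return low_edge <= score <= high_edge
    raise ValueError("unknown regime: %r" % (regime,))
```

For example, should_train("mistake", 0.5, "spam") is true because an unsure
counts as a mistake, while a spam scoring 0.99 is skipped under both
'mistake' and 'nonedge' and only trained under 'everything'.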

=Tony Meyer

---
Please always include the list (spambayes at python.org) in your replies
(reply-all), and please don't send me personal mail about SpamBayes. This
way, you get everyone's help, and avoid a lack of replies when I'm busy.



