[Spambayes] test sets?

Guido van Rossum guido@python.org
Sat, 07 Sep 2002 16:19:16 -0400


> [Guido]
> > Perhaps more useful would be if Tim could check in the pickle(s?)
> > generated by one of his training runs, so that others can see how
> > Tim's training data performs against their own corpora.

[Tim]
> I did that yesterday, but seems like nobody bit.

I downloaded and played with it a bit, but had no time to do anything
systematic.  It correctly recognized a spam that slipped through SA.
But it also identified as spam everything in my inbox that had any
MIME structure or HTML parts, and several messages in my saved 'zope
geeks' list that happened to be using MIME and/or HTML.

So I guess I'll have to retrain it (yes, you told me so :-).
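
A toy sketch of why that can happen, assuming the training ham had
almost no MIME or HTML mail in it (my guess, not something I measured).
With a Graham-style per-token probability, a token seen only in spam
gets pinned near 0.99, so any message carrying it looks spammy.  The
function names, tokens and counts below are made up for illustration;
this is not the project's code.

```python
def token_spamprob(spam_count, ham_count, nspam, nham):
    """Graham-style per-token spam probability, clamped to [0.01, 0.99]."""
    spam_ratio = min(1.0, spam_count / nspam)
    ham_ratio = min(1.0, 2.0 * ham_count / nham)  # Graham's scheme doubles ham counts
    prob = spam_ratio / (spam_ratio + ham_ratio)
    return max(0.01, min(0.99, prob))


def combine(probs):
    """Naive Bayes combination of independent per-token probabilities."""
    spam = 1.0
    ham = 1.0
    for p in probs:
        spam *= p
        ham *= 1.0 - p
    return spam / (spam + ham)


# Hypothetical training totals and per-token (spam_count, ham_count) pairs.
NSPAM, NHAM = 1000, 1000
counts = {
    "<html>": (600, 0),                  # never seen in the training ham
    "content-type:multipart": (500, 2),  # almost never seen in the training ham
    "python": (5, 400),
    "zope": (1, 50),
}
probs = {tok: token_spamprob(s, h, NSPAM, NHAM) for tok, (s, h) in counts.items()}

# A legitimate HTML message about Zope versus a plain-text one about Python.
html_ham_score = combine([probs[t] for t in ("<html>", "content-type:multipart", "zope")])
plain_ham_score = combine([probs[t] for t in ("python", "zope")])
```

Here the two structural tokens outvote the one hammy word, so the HTML
message scores high even though its content is fine.  Retraining on my
own ham, which does contain MIME and HTML, should pull those tokens
back toward neutral.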

> Just in case <wink>, I
> uploaded a new version just now.  Since MINCOUNT went away, UNKNOWN_SPAMPROB
> is much less likely, and there's almost nothing that can be pruned away (so
> the file is about 5x larger now).
> 
>     http://sf.net/project/showfiles.php?group_id=61702

I'll try this when I have time.
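
As I understand the size change, here is a rough sketch.  It assumes
MINCOUNT was a minimum total count below which a word was scored as
UNKNOWN_SPAMPROB, so such words carried no information and could be
pruned from the saved file.  The threshold value, the prune() helper
and the counts are all hypothetical.

```python
MINCOUNT = 5  # hypothetical threshold, not necessarily the project's value


def prune(word_counts, mincount):
    """Keep only words whose total count reaches mincount.

    Under the assumption above, rarer words would all get the same
    'unknown word' probability, so dropping them loses nothing.
    """
    return {word: c for word, c in word_counts.items()
            if c[0] + c[1] >= mincount}


# Made-up (spam_count, ham_count) pairs; most words are rare.
word_counts = {
    "free": (300, 20),
    "python": (5, 400),
    "viagra": (80, 0),
    "xyzzy": (1, 0),
    "frobnicate": (0, 2),
    "qwerty123": (1, 1),
    "unsubscribe": (3, 0),
}

with_mincount = prune(word_counts, MINCOUNT)  # rare words dropped
without_mincount = prune(word_counts, 0)      # nothing can be dropped
```

With no threshold every rare word gets its own probability and has to
be kept, which fits the file growing several times over.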

--Guido van Rossum (home page: http://www.python.org/~guido/)