[Spambayes] Numeric python store, hammiefilter extension and mutt macros

Tim Stone - Four Stones Expressions tim@fourstonesExpressions.com
Fri Nov 22 00:42:57 2002


Sounds really good, Adam.  Neale Pickett and I have been working on this kind
of stuff in a branch named hammie-playground.  Some substantial changes have
been made there, and they'll be merged into the main branch soon.  You might
want to check there to see how your changes would fit in.  I really like your
results.  Size and speed have been consistent challenges for us.

- TimS

11/21/2002 6:27:13 PM, Adam Hupp <hupp@cs.wisc.edu> wrote:

>
>I've been working on a store for spambayes that uses the Numeric
>python extension.  It's substantially faster than PersistentBayes and
>the database is about half the size.  A comparison, training on 992 messages:
>
>PersistentBayes:
>training: 220s
>update_prob: 3.2s
>score 1 msg: 0.45s
>score 6156 msgs: 58s
>
>NumericBayes:
>training: 14s
>update_prob: 0.10s
>score 1 msg: 0.59s
>score 6156 msgs: 49s
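To put the two timing lists side by side, here is a small sketch that turns
them into speedup ratios.  The numbers are copied from the comparison above;
the dictionary keys and the speedups() helper are just illustrative names.

```python
# Timings in seconds, copied from the comparison quoted above.
persistent = {
    "training": 220.0,
    "update_prob": 3.2,
    "score 1 msg": 0.45,
    "score 6156 msgs": 58.0,
}
numeric = {
    "training": 14.0,
    "update_prob": 0.10,
    "score 1 msg": 0.59,
    "score 6156 msgs": 49.0,
}


def speedups(old, new):
    """Return old/new time ratios; a value above 1 means `new` is faster."""
    return {name: old[name] / new[name] for name in old}


ratios = speedups(persistent, numeric)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x")
```

By these figures the big wins are in training and update_prob.  Scoring a
single message comes out slightly slower with NumericBayes, while the batch
of 6156 messages is somewhat faster.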
>
>There are no modifications to classifier.Bayes; the store just supplies a
>new WordInfo class with properties.
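For readers who have not seen the trick, here is a minimal sketch of how a
WordInfo-like class can use properties to read and write array storage while
still looking like a plain record to the classifier.  This is not Adam's code:
it assumes the standard library's array module as a stand-in for Numeric, and
CountStore, row_for and the spamcount/hamcount attribute names are
hypothetical.

```python
from array import array


class CountStore:
    """Hypothetical store: one row of counts per word, kept in flat arrays."""

    def __init__(self):
        self.index = {}              # word -> row number
        self.spamcount = array("l")  # spam count for each row
        self.hamcount = array("l")   # ham count for each row

    def row_for(self, word):
        """Return the row for `word`, appending a zeroed row if it is new."""
        row = self.index.get(word)
        if row is None:
            row = len(self.spamcount)
            self.index[word] = row
            self.spamcount.append(0)
            self.hamcount.append(0)
        return row


class WordInfo:
    """Looks like a plain record, but every access goes to the arrays."""

    def __init__(self, store, word):
        self._store = store
        self._row = store.row_for(word)

    @property
    def spamcount(self):
        return self._store.spamcount[self._row]

    @spamcount.setter
    def spamcount(self, value):
        self._store.spamcount[self._row] = value

    @property
    def hamcount(self):
        return self._store.hamcount[self._row]

    @hamcount.setter
    def hamcount(self, value):
        self._store.hamcount[self._row] = value
```

Code that does "info.spamcount += 1" keeps working unchanged, which is
presumably why the classifier itself needs no modification.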
>
>I also modified hammiefilter to do untraining, retraining, and
>training on filter results.  For example:
>
>hammiefilter.py --filter --train
>
>The incoming message is scored and filtered.  If the result is not
>"Unsure", the classifier will be trained on the message with that result.
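The filter-then-train behaviour described above can be sketched roughly as
follows.  This is an assumption-laden illustration, not the code in the
tarball: the cutoff values, the "Ham"/"Spam" labels, and the score_fn and
train_fn callables are all hypothetical.

```python
HAM_CUTOFF = 0.2   # assumed threshold, not taken from the post
SPAM_CUTOFF = 0.9  # assumed threshold, not taken from the post


def classify(score):
    """Map a spam probability to one of three labels."""
    if score < HAM_CUTOFF:
        return "Ham"
    if score > SPAM_CUTOFF:
        return "Spam"
    return "Unsure"


def filter_and_train(message, score_fn, train_fn):
    """Score `message`, then train on the verdict unless it is "Unsure".

    score_fn(message) returns a probability in [0, 1];
    train_fn(message, is_spam=...) updates the classifier.
    """
    label = classify(score_fn(message))
    if label != "Unsure":
        train_fn(message, is_spam=(label == "Spam"))
    return label
```

One thing to watch with this kind of self-training is that any mistake the
filter makes confidently gets trained in, so a way to reverse a bad training
step matters.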
>
>
>hammiefilter.py --reverse --good --train
>
>The incoming message has previously been incorrectly marked as ham.
>--reverse will untrain the classifier and --train will retrain it on
>the message as spam.
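As a rough picture of what "untrain, then retrain" means for the stored
counts, here is a toy sketch.  ToyCounts and reverse_and_retrain are
hypothetical names, and a real classifier tracks more than per-word counts;
the point is only that the earlier training is subtracted before the
corrected training is added.

```python
from collections import Counter


class ToyCounts:
    """Hypothetical stand-in for a classifier's per-word message counts."""

    def __init__(self):
        self.ham = Counter()
        self.spam = Counter()

    def learn(self, words, is_spam):
        """Count each distinct word once for this message."""
        target = self.spam if is_spam else self.ham
        target.update(set(words))

    def unlearn(self, words, is_spam):
        """Undo an earlier learn() call made with the same arguments."""
        target = self.spam if is_spam else self.ham
        target.subtract(set(words))
        # Drop words whose count fell to zero so the store does not grow.
        for word in [w for w, n in target.items() if n <= 0]:
            del target[word]


def reverse_and_retrain(counts, words, was_spam):
    """Remove the wrong training, then train with the opposite label."""
    counts.unlearn(words, is_spam=was_spam)
    counts.learn(words, is_spam=not was_spam)


counts = ToyCounts()
counts.learn(["cheap", "pills"], is_spam=False)   # wrongly trained as ham
reverse_and_retrain(counts, ["cheap", "pills"], was_spam=False)
```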
>
>With these tools it's straightforward to set up macros in mutt to
>manage false negatives/positives and classify "Unsure" messages.
>
>The modified files can be found at:
>
>http://www.upl.cs.wisc.edu/~hupp/spambayes.tar.gz
>
>hammiefilter requires Optik and the NumericBayes store requires
>Numeric and MaskedArray (an optional part of Numeric).
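On the Optik dependency: Optik was later folded into the Python standard
library as optparse (from Python 2.3), so the four flags shown above could be
parsed along these lines.  This is only a sketch; how the modified
hammiefilter actually declares its options is not shown in the post, and the
help strings here are my own reading of the examples.

```python
from optparse import OptionParser


def build_parser():
    """Build a parser for the four flags used in the examples above."""
    parser = OptionParser(usage="%prog [options] < message")
    parser.add_option("--filter", action="store_true", default=False,
                      help="score the incoming message and filter it")
    parser.add_option("--train", action="store_true", default=False,
                      help="train the classifier on the message")
    parser.add_option("--reverse", action="store_true", default=False,
                      help="untrain the message before any retraining")
    parser.add_option("--good", action="store_true", default=False,
                      help="the message was previously treated as ham")
    return parser


# Parse the second command line from the post.
options, args = build_parser().parse_args(["--reverse", "--good", "--train"])
```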
>
>-Adam
>
>_______________________________________________
>Spambayes mailing list
>Spambayes@python.org
>http://mail.python.org/mailman/listinfo/spambayes
>
>
- Tim
www.fourstonesExpressions.com 