[Spambayes] Training on individual messages

Sat Nov 16 18:05:32 2002

Hi Paul,

> I wonder, though - is this the right thing to do? Should Hammie be
> growing more and more options (at the back of my mind is the
> possibility of an "unlearn" option, needed if a message gets
> misclassified) or should these sorts of things be split out into
> separate utilities?

They should be in a shared module IMHO, and you're right about this:

> There's been some messages recently about some form of "Corpus" class
> - is that going to address any of this?

Yes - Tim Stone's Corpus class, which he's just committed, encapsulates 
a corpus of emails, and lets you set up automatic training when 
adding/removing/moving messages.  So for instance, you create a Spam 
corpus, attach a Trainer object to it, and call addMessage - that adds 
the message to the corpus, and trains on that message as Spam.  Removing 
the message untrains it.  pop3proxy.py is now using this for a web-based 
training interface, which I'm hoping to commit in the next couple of 
days.
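
To make the idea concrete, here is a rough sketch of the pattern, not the committed code. Only the names Corpus, Trainer and addMessage come from the description above; WordCounts, addObserver, removeMessage, onAddMessage and onRemoveMessage are made up for illustration, and the toy token counter stands in for the real classifier.

```python
# Hypothetical sketch of "corpus with an attached trainer".
# Apart from Corpus, Trainer and addMessage, every name here is an
# assumption made for illustration, not the real Spambayes API.
from collections import Counter


class WordCounts:
    """Toy stand-in for a classifier: per-token spam/ham counts."""

    def __init__(self):
        self.spam = Counter()
        self.ham = Counter()

    def learn(self, tokens, is_spam):
        (self.spam if is_spam else self.ham).update(tokens)

    def unlearn(self, tokens, is_spam):
        (self.spam if is_spam else self.ham).subtract(tokens)


class Trainer:
    """Observer that trains or untrains as messages enter or leave a corpus."""

    def __init__(self, classifier, is_spam):
        self.classifier = classifier
        self.is_spam = is_spam

    def onAddMessage(self, key, text):
        self.classifier.learn(text.split(), self.is_spam)

    def onRemoveMessage(self, key, text):
        self.classifier.unlearn(text.split(), self.is_spam)


class Corpus:
    """A keyed collection of messages that notifies attached observers."""

    def __init__(self):
        self.messages = {}
        self.observers = []

    def addObserver(self, observer):
        self.observers.append(observer)

    def addMessage(self, key, text):
        if key in self.messages:
            # Untrain the old copy first so counts are not doubled.
            self.removeMessage(key)
        self.messages[key] = text
        for observer in self.observers:
            observer.onAddMessage(key, text)

    def removeMessage(self, key):
        text = self.messages.pop(key)
        for observer in self.observers:
            observer.onRemoveMessage(key, text)


counts = WordCounts()
spam_corpus = Corpus()
spam_corpus.addObserver(Trainer(counts, is_spam=True))
spam_corpus.addMessage("msg1", "buy cheap pills")  # trains as spam
spam_corpus.removeMessage("msg1")                  # untrains again
```

In a layout like this, the "unlearn" case you mention needs no extra option: moving a misclassified message from one corpus to the other is a remove followed by an add, and the trainers do the rest.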

-- 
Richie Hindle
richie@entrian.com