[Spambayes] Is there an end to training?

Skip Montanaro skip at pobox.com
Wed May 19 09:11:42 EDT 2004


    Oleg> Is there a criterion that will tell if the program has been fully
    Oleg> trained?  I get 1,000 e-mails a day and if I have to constantly
    Oleg> train it, it will be just a nuisance.  The page opens up too slowly
    Oleg> after I download a few hundred e-mails.
 
    Oleg> Is there an end to the training? At some point I would like to
    Oleg> stop manually assigning ham/spam to each individual e-mail..
 
There are two basic approaches to training: train-on-everything and
train-on-mistakes.  It sounds like you're doing the former.  Perhaps you
should try the latter.  Train-on-mistakes counts both outright mistakes
(false positives or false negatives) and unsure messages as "mistakes" and
trains only on those.  Any message which is classified correctly is not
trained on, so you only handle the messages the classifier gets wrong or
can't decide on.  Even with such training, there are some messages which
should just remain unsure (at least in my opinion).  Bounce messages of one
sort or another can be particularly problematic, as they often have large
chunks of both highly hammy and highly spammy content.  I just accept the
fact that those sorts of messages will wind up in my unsure box.
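
As a rough sketch of the train-on-mistakes rule described above (this is not
SpamBayes's own code; the cutoff values and the names score_fn and train_fn
are assumptions made up for illustration), the decision could look something
like this:

```python
from typing import Callable, Iterable, Tuple

# Assumed cutoffs for illustration only; use whatever your setup is configured with.
HAM_CUTOFF = 0.20
SPAM_CUTOFF = 0.90


def classify(score: float) -> str:
    """Map a spam score in [0, 1] to 'ham', 'unsure', or 'spam'."""
    if score < HAM_CUTOFF:
        return "ham"
    if score >= SPAM_CUTOFF:
        return "spam"
    return "unsure"


def needs_training(score: float, is_spam: bool) -> bool:
    """True for unsure messages and outright misclassifications."""
    truth = "spam" if is_spam else "ham"
    # 'unsure' never equals the true label, so it always counts as a mistake.
    return classify(score) != truth


def train_on_mistakes(
    messages: Iterable[Tuple[str, bool]],
    score_fn: Callable[[str], float],        # hypothetical: returns a spam score
    train_fn: Callable[[str, bool], None],   # hypothetical: trains on one message
) -> int:
    """Train only on messages the classifier got wrong or was unsure about."""
    trained = 0
    for text, is_spam in messages:
        if needs_training(score_fn(text), is_spam):
            train_fn(text, is_spam)
            trained += 1
    return trained
```

Correctly classified messages are skipped, so the only manual step left is
deciding ham or spam for whatever lands in the wrong folder or in unsure.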

You might check out the Spambayes Wiki:

    http://www.entrian.com/sbwiki

Click the TrainingIdeas link for more detail.

Skip



