[spambayes-dev] Another piece of anecdotal evidence

Wed Jan 14 14:35:11 EST 2004

Hi,

>    >> How do you plan to find those mistrained messages?
>
>    Alex> As part of my nightly retrain, I'm going to make it score each
>    Alex> message (with the fully trained DB) and sort them into 6
>    Alex> directories for each month: {ham,spam}{positive,unsure,negative}.
>    Alex> Flipping through the hampositive directory for each month should
>    Alex> make it fairly easy to spot the problems...
>
>I'm still confused.  You've got a spam mistrained as ham.  Are you
>suggesting that you expect that scoring that message against your training
>database (which includes features gleaned from that message) will reveal
>that it is something other than ham?  
>
I wrote a little script that looks at each message's training header,
then reclassifies the message using the current database and checks
whether the training header and the classification header agree.
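
A minimal sketch of that kind of check (not the original script) could
look like the following. It assumes the training label is stored in an
X-Spambayes-Trained header with the value "ham" or "spam", and that the
classifier is available as a function returning a spam probability
between 0 and 1. The names find_suspects, label_for and score_fn are
hypothetical, and the 0.20 / 0.90 cutoffs are assumed values; use
whatever your own configuration says.

```python
import email
from email.message import Message
from typing import Callable, Iterable, List, Tuple

# Assumed header name and cutoffs; adjust to match your own setup.
TRAINED_HEADER = "X-Spambayes-Trained"
HAM_CUTOFF = 0.20
SPAM_CUTOFF = 0.90


def label_for(score: float) -> str:
    """Map a spam probability to 'ham', 'unsure' or 'spam'."""
    if score < HAM_CUTOFF:
        return "ham"
    if score > SPAM_CUTOFF:
        return "spam"
    return "unsure"


def find_suspects(
    messages: Iterable[Message],
    score_fn: Callable[[Message], float],
) -> List[Tuple[str, str, str]]:
    """Return (message-id, trained-as, scores-as) for every message
    whose current classification disagrees with its training label."""
    suspects = []
    for msg in messages:
        trained = (msg.get(TRAINED_HEADER) or "").strip().lower()
        if trained not in ("ham", "spam"):
            continue  # message was never trained on; nothing to compare
        current = label_for(score_fn(msg))
        if current != trained:
            suspects.append((msg.get("Message-ID", "?"), trained, current))
    return suspects


if __name__ == "__main__":
    # Toy data with a stand-in scorer; a real run would score each
    # message against the trained database instead.
    raw = (
        "Message-ID: <demo@example.invalid>\n"
        f"{TRAINED_HEADER}: ham\n"
        "\n"
        "body\n"
    )
    demo = email.message_from_string(raw)
    for mid, trained, current in find_suspects([demo], lambda m: 0.55):
        print(f"{mid}: trained as {trained}, now scores as {current}")
```

Anything this reports is only a candidate for manual review; a
disagreement between the two labels does not prove the message was
trained on the wrong side.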

If a message had been misclassified, it usually showed up as unsure.
My database contained thousands of ham and spam messages.

Remi

-- 
/"\
\ /
 X   ASCII Ribbon Campaign
/ \  Against HTML Email