[spambayes-dev] lowest scoring message isn't always "best" one to train on

Skip Montanaro skip at pobox.com
Mon Jan 19 16:03:59 EST 2004

    Kenny> Maybe a closer approximation to this would be to look for the
    Kenny> message that causes the greatest increase in the mean spam score
    Kenny> of the remaining messages.

I've started calculating the delta mean (the change in the mean spam score
of the remaining messages) as well as the number of messages pushed into
spam territory.  Eyeballing a plot of just over 100 (mean diff, # new spams)
pairs suggests there's a weak correlation between the two variables.
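
For anyone who wants to try the same measurement, here is a minimal sketch
of the two quantities and of a correlation check over the resulting pairs.
It is not the script mentioned in this message: the names training_effect,
pearson and SPAM_CUTOFF are hypothetical, the 0.90 cutoff is an assumed
value, and the scores are taken to be plain lists of floats for the same
remaining messages before and after training on one candidate.

```python
from math import sqrt
from statistics import mean

# Assumed spam threshold; the post does not say which cutoff was used.
SPAM_CUTOFF = 0.90


def training_effect(before, after, cutoff=SPAM_CUTOFF):
    """Return (mean diff, # new spams) for one candidate training message.

    before and after hold the spam scores of the same remaining messages,
    in the same order, before and after training on the candidate.
    """
    mean_diff = mean(after) - mean(before)
    # A "new spam" is a message that was below the cutoff and is now at or
    # above it.
    new_spams = sum(1 for b, a in zip(before, after) if b < cutoff <= a)
    return mean_diff, new_spams


def pearson(pairs):
    """Pearson correlation coefficient for a list of (x, y) pairs."""
    xs, ys = zip(*pairs)
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

Calling training_effect once per candidate message and passing the
collected pairs to pearson gives a number to set against the eyeballed
plot.  Note that pearson divides by zero if either variable is constant
across all pairs.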

I'll probably play with it a bit more and check the script into the contrib
section so other people can play with it.

