[Spambayes] test sets?
Anthony Baxter
anthony@interlink.com.au
Sat, 07 Sep 2002 13:38:51 +1000
> > Note that header names are case insensitive, so this one's no
> > different than "MIME-Version:". Similarly other headers in your list.
>
> Ignoring case here may or may not help; that's for experiment to decide.
> It's plausible that case is significant, if, e.g., a particular spam mailing
> package generates unusual case, or a particular clueless spammer
> misconfigures his package.
I found it made no difference in my testing.
> The brilliance of Anthony's "just count them" scheme is that it requires no
> thought, so can't be fooled <wink>. Header lines that are evenly
> distributed across spam and ham will turn out to be worthless indicators
> (prob near 0.5), so do no harm.
zactly. I started off doing clever clever things, and, as always with
this stuff, found that stupid with a rock beats smart with scissors,
every time.
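
For illustration only, here is a minimal sketch of the "just count them"
idea, not the actual Spambayes code. It assumes each message is a raw
RFC 822 string, and the function names (header_names, header_probs) and
the fold_case switch are hypothetical. The score is a plain ratio of
per-corpus frequencies, so a header name that shows up equally often in
spam and ham lands at 0.5.

```python
from collections import Counter
from email import message_from_string


def header_names(raw_message, fold_case=False):
    """Return the set of header names present in one raw message."""
    names = message_from_string(raw_message).keys()
    if fold_case:
        # Optionally treat "MIME-version" and "MIME-Version" as the same name.
        names = [name.lower() for name in names]
    return set(names)


def header_probs(spam_msgs, ham_msgs, fold_case=False):
    """Estimate a spam probability for each header name by counting.

    Assumption: a simple frequency ratio, not Spambayes' real scoring.
    """
    spam_counts = Counter()
    ham_counts = Counter()
    for raw in spam_msgs:
        spam_counts.update(header_names(raw, fold_case))
    for raw in ham_msgs:
        ham_counts.update(header_names(raw, fold_case))

    nspam = max(len(spam_msgs), 1)
    nham = max(len(ham_msgs), 1)
    probs = {}
    for name in set(spam_counts) | set(ham_counts):
        spam_ratio = spam_counts[name] / nspam
        ham_ratio = ham_counts[name] / nham
        # Every name here was seen at least once, so the sum is nonzero.
        probs[name] = spam_ratio / (spam_ratio + ham_ratio)
    return probs
```

Running header_probs twice on the same corpora, once with fold_case=False
and once with fold_case=True, is one way to check whether case matters
for a given test set.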
--
Anthony Baxter <anthony@interlink.com.au>
It's never too late to have a happy childhood.