[Spambayes] RE: spam detection via probability - actual results!

Tim Peters tim.one@comcast.net
Fri, 20 Sep 2002 10:51:14 -0400


[Sjoerd Mullender, using

> > [Classifier]
> > use_robinson_probability: True
> > max_discriminators: 150
> > hambias: 1.0
> > [TestDriver]
> > spam_cutoff: 0.50
>
> Here are my results.  I also have
> [Tokenizer]
> count_all_header_lines: True
> mine_received_headers: True
> in both runs.

running an 8-fold cv with 800 spam and 5600 ham, sees his first false
positive (for a ham scoring 0.525), and what may be a reduction in f-n rate
(4 runs tie, on 1 run 2 false negatives go away, on 3 runs 1 f-n goes away).
Trying again with the default hambias 2.0, the f-p rate is unaffected, and the
f-n rate loses on 5 runs of 8 and ties on 3]

That's very helpful -- thanks.  At this point I think it's best to stop
trying to "hill climb" into Gary's approach one little change at a time --
I'll implement all his suggestions at once (he's got good reasons for why
they *should* be done together).
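
For anyone following along, here is a minimal sketch of how a spam_cutoff
of 0.50 turns a ham scoring 0.525 into a false positive. It is not the
test driver's actual code: the function name count_errors, the (score,
is_spam) pair layout, and the strict ">" comparison are all assumptions made
for illustration, and every score except 0.525 is invented.

```python
def count_errors(scored, spam_cutoff=0.50):
    """Count false positives and false negatives for one test run.

    scored is an iterable of (score, is_spam) pairs, where score is the
    classifier's spam probability and is_spam is the true label.
    Assumption: a message is called spam when score > spam_cutoff.
    """
    fp = fn = 0
    for score, is_spam in scored:
        predicted_spam = score > spam_cutoff
        if predicted_spam and not is_spam:
            fp += 1   # ham wrongly called spam
        elif not predicted_spam and is_spam:
            fn += 1   # spam wrongly called ham
    return fp, fn


# Illustrative data only: the 0.525 ham is the case described above,
# the other scores are made up.
run = [(0.525, False), (0.10, False), (0.99, True), (0.40, True)]
fp, fn = count_errors(run)
print("false positives:", fp, "false negatives:", fn)
```

Raising spam_cutoff above 0.525 in this sketch would clear that f-p, but
any spam scoring under the new cutoff would then become an f-n.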