[Spambayes] RE: spam detection via probability - actual results!
Tim Peters
tim.one@comcast.net
Fri, 20 Sep 2002 10:51:14 -0400
[Sjoerd Mullender, using
> > [Classifier]
> > use_robinson_probability: True
> > max_discriminators: 150
> > hambias: 1.0
> > [TestDriver]
> > spam_cutoff: 0.50
>
> Here are my results. I also have
> [Tokenizer]
> count_all_header_lines: True
> mine_received_headers: True
> in both runs.
running an 8-fold cross-validation with 800 spam and 5600 ham, sees his
first false positive (a ham scoring 0.525, just above the 0.50 cutoff), and
what may be a reduction in the f-n rate (4 runs tie, on 1 run 2 false
negatives go away, on 3 runs 1 f-n goes away). Trying again with the default
hambias 2.0, f-p is unaffected, and f-n loses on 5 runs of 8 and ties on 3]
That's very helpful -- thanks. At this point I think it's best to stop
trying to "hill climb" into Gary's approach one little change at a time --
I'll implement all his suggestions at once (he's got good reasons for why
they *should* be done together).
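
For anyone following along who hasn't looked at how the f-p and f-n counts in the summary above relate to spam_cutoff, here is a minimal sketch. It is not the project's test driver: the function name, the strict ">" comparison, and the example scores other than 0.525 are all assumptions made up for illustration.

```python
def confusion_counts(scored, spam_cutoff=0.50):
    """Count false positives and false negatives for one test run.

    scored is an iterable of (score, is_spam) pairs, where score is the
    classifier's spam probability for a message and is_spam is its true
    label.  The function name and the strict '>' comparison are
    assumptions for this sketch, not taken from the real test driver.
    """
    false_pos = false_neg = 0
    for score, is_spam in scored:
        predicted_spam = score > spam_cutoff
        if predicted_spam and not is_spam:
            false_pos += 1   # ham wrongly called spam
        elif not predicted_spam and is_spam:
            false_neg += 1   # spam wrongly called ham
    return false_pos, false_neg


# Made-up scores, except that 0.525 mirrors the ham mentioned above.
example = [(0.525, False), (0.10, False), (0.93, True), (0.40, True)]
print(confusion_counts(example))         # cutoff 0.50
print(confusion_counts(example, 0.55))   # a slightly higher cutoff
```

With a cutoff of 0.50 the ham at 0.525 counts as a false positive; nudging the cutoff to 0.55 makes it go away without changing the false negative in this toy data, which is why a ham scoring that close to the cutoff is a borderline case.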