Third result ... RE: [Spambayes] First result from Gary Robinson's ideas

Tim Peters tim.one@comcast.net
Thu, 19 Sep 2002 16:00:09 -0400


[Sjoerd Mullender]
> >     [Classifier]
> >     use_robinson_probability: True
> >
> >     [TestDriver]
> >     spam_cutoff: 0.50

> Here are my results.  run1 was default, run2 with the above settings.
>
> By the way, I'm using runtest.sh, so I guess I'm the number two and
> not Guido.  :-)
>
> run1s -> run2s
> -> <stat> tested 159 hams & 159 spams against 636 hams & 636 spams
...
>
> false positive percentages
>     0.000  0.000  tied
>     0.000  0.000  tied
>     0.000  0.000  tied
>     0.000  0.000  tied
>     0.000  0.000  tied
>
> won   0 times
> tied  5 times
> lost  0 times
>
> total unique fp went from 0 to 0 tied
> mean fp % went from 0.0 to 0.0 tied
>
> false negative percentages
>     2.516  2.516  tied
>     0.000  0.000  tied
>     1.258  1.258  tied
>     2.516  2.516  tied
>     1.258  1.258  tied
>
> won   0 times
> tied  5 times
> lost  0 times
>
> total unique fn went from 12 to 12 tied
> mean fn % went from 1.50943396226 to 1.50943396226 tied

Thank you, Sjoerd.  Just noting that (according to the "after" histograms),
boosting spam_cutoff to 0.525 would have left your f-p rate unchanged but
added 2 false negatives to your f-n total.  In the other direction,
lowering it to 0.475 would have saved you 2 false negatives, but added your
first false positive.  The separation near 0.50 is clearly touchy, but it
remains pretty amazing how close 0.50 is to ideal for everyone who has tried
this.
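
To make the cutoff tradeoff concrete, here is a minimal sketch in Python.  It
is not the spambayes test driver: count_errors is a hypothetical helper, the
scores are made up (chosen only so the counts move by the same amounts as
described above), and treating a score equal to the cutoff as spam is an
assumption.

```python
def count_errors(ham_scores, spam_scores, cutoff):
    """Return (false_positives, false_negatives) at the given cutoff.

    Assumption: a message is called spam when its score >= cutoff.
    """
    # A ham scored at or above the cutoff is a false positive.
    fp = sum(1 for score in ham_scores if score >= cutoff)
    # A spam scored below the cutoff is a false negative.
    fn = sum(1 for score in spam_scores if score < cutoff)
    return fp, fn


# Made-up scores, NOT taken from Sjoerd's run.
ham_scores = [0.05, 0.12, 0.30, 0.41, 0.49]
spam_scores = [0.40, 0.48, 0.49, 0.51, 0.52, 0.70, 0.95]

for cutoff in (0.475, 0.50, 0.525):
    fp, fn = count_errors(ham_scores, spam_scores, cutoff)
    print(f"spam_cutoff={cutoff:.3f}  fp={fp}  fn={fn}")
```

With these invented scores, moving from 0.50 up to 0.525 adds 2 false
negatives and no false positives, while moving down to 0.475 saves 2 false
negatives at the cost of the first false positive.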