[spambayes-dev] A URL experiment
Tim Peters
tim.one at comcast.net
Sun Jan 4 21:37:50 EST 2004
Here are my current results with Skip's latest patch; "url" is the same as
"base" except with the addition of
x-pick_apart_urls: True
bases -> urls
-> <stat> tested 342 hams & 94 spams against 3078 hams & 846 spams
<19 repetitions deleted>
false positive percentages
0.292 0.292 tied
0.000 0.000 tied
0.000 0.000 tied
0.292 0.292 tied
0.000 0.000 tied
0.000 0.000 tied
0.292 0.292 tied
0.000 0.000 tied
0.000 0.000 tied
0.000 0.000 tied
won 0 times
tied 10 times
lost 0 times
total unique fp went from 3 to 3 tied
mean fp % went from 0.0877192982457 to 0.0877192982457 tied
false negative percentages
2.128 2.128 tied
0.000 0.000 tied
0.000 0.000 tied
1.064 1.064 tied
2.128 2.128 tied
2.128 2.128 tied
2.128 2.128 tied
0.000 0.000 tied
0.000 0.000 tied
0.000 0.000 tied
won 0 times
tied 10 times
lost 0 times
total unique fn went from 9 to 9 tied
mean fn % went from 0.957446808511 to 0.957446808511 tied
ham mean ham sdev
0.51 0.51 +0.00% 5.96 5.96 +0.00%
0.12 0.12 +0.00% 1.08 1.09 +0.93%
0.44 0.44 +0.00% 4.55 4.55 +0.00%
0.39 0.39 +0.00% 5.59 5.59 +0.00%
0.49 0.49 +0.00% 4.58 4.60 +0.44%
0.84 0.85 +1.19% 6.12 6.18 +0.98%
0.47 0.47 +0.00% 5.60 5.60 +0.00%
0.34 0.34 +0.00% 3.15 3.15 +0.00%
0.20 0.20 +0.00% 2.08 2.08 +0.00%
0.08 0.08 +0.00% 0.88 0.89 +1.14%
ham mean and sdev for all runs
0.39 0.39 +0.00% 4.40 4.41 +0.23%
spam mean spam sdev
94.15 94.16 +0.01% 17.84 17.83 -0.06%
98.85 98.87 +0.02% 4.99 4.94 -1.00%
98.07 98.34 +0.28% 6.49 5.99 -7.70%
96.98 96.99 +0.01% 13.46 13.49 +0.22%
96.21 96.25 +0.04% 15.89 15.83 -0.38%
94.07 94.07 +0.00% 17.29 17.29 +0.00%
95.61 95.65 +0.04% 16.66 16.65 -0.06%
96.62 96.66 +0.04% 11.43 11.16 -2.36%
99.25 99.27 +0.02% 2.55 2.55 +0.00%
97.43 97.44 +0.01% 9.85 9.82 -0.30%
spam mean and sdev for all runs
96.72 96.77 +0.05% 12.88 12.82 -0.47%
ham/spam mean difference: 96.33 96.38 +0.05
filename:       base      url
ham:spam:   3420:940 3420:940
fp total:          3        3
fp %:           0.09     0.09
fn total:          9        9
fn %:           0.96     0.96
unsure t:         80       79
unsure %:       1.83     1.81
real cost:    $55.00   $54.80
best cost:    $43.80   $43.00
h mean:         0.39     0.39
h sdev:         4.40     4.41
s mean:        96.72    96.77
s sdev:        12.88    12.82
mean diff:     96.33    96.38
k:              5.57     5.59
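
The percentage and k rows in the summary can be cross-checked from the raw counts. This is a minimal sketch, not code from the test tools. It assumes (the post does not say so) that each percentage is taken over the pooled ham, spam, or combined message total, and that k is the gap between the spam and ham score means divided by the sum of their standard deviations. The helper names are made up for illustration. Under those assumptions the "base" column is reproduced:

```python
def pct(count, total):
    """Return count as a percentage of total."""
    return 100.0 * count / total


def separation_k(ham_mean, ham_sdev, spam_mean, spam_sdev):
    """Assumed definition: gap between the means over the summed spreads."""
    return (spam_mean - ham_mean) / (spam_sdev + ham_sdev)


# Counts and score statistics copied from the "base" column.
n_ham, n_spam = 3420, 940

fp_pct = pct(3, n_ham)                 # false positives are misjudged hams
fn_pct = pct(9, n_spam)                # false negatives are misjudged spams
unsure_pct = pct(80, n_ham + n_spam)   # unsures are counted over all messages
k = separation_k(0.39, 4.40, 96.72, 12.88)

print(f"{fp_pct:.2f} {fn_pct:.2f} {unsure_pct:.2f} {k:.2f}")
# prints: 0.09 0.96 1.83 5.57
```

Plugging in the "url" column (79 unsures, means 0.39 and 96.77, sdevs 4.41 and 12.82) gives 1.81 and 5.59 in the same way.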
It's not hurting <wink>. Skip, why don't you check this in, so we can try
to make testing easier for others? I'm fine with making it the default
behavior, provided we get decent test results from more people.
[& Skip tests bigrams]
> ...
> false negative percentages
> 7.874 6.299 won -20.00%
> 6.299 4.724 won -25.00%
> 9.449 6.299 won -33.34%
> 9.449 5.512 won -41.67%
> 10.236 4.724 won -53.85%
> 5.512 1.575 won -71.43%
> 7.087 5.512 won -22.22%
> 5.556 5.556 tied
> 7.937 7.937 tied
> 8.661 2.362 won -72.73%
>
> won 8 times
> tied 2 times
> lost 0 times
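
The last column of the quoted table reads as the relative change from the first rate to the second. This is a minimal sketch under that reading. It also assumes the change is computed from the rounded rates as printed, which would explain why the third row shows -33.34 rather than -33.33. `rel_change` is a hypothetical helper, not a function taken from cmp.py:

```python
def rel_change(before, after):
    """Percent change from before to after; negative means the rate dropped."""
    return 100.0 * (after - before) / before


# Three (before, after) false negative rates copied from the quoted table.
rows = [(7.874, 6.299), (9.449, 6.299), (5.512, 1.575)]

for before, after in rows:
    print(f"{before:.3f} {after:.3f} {rel_change(before, after):+.2f}%")
# prints:
# 7.874 6.299 -20.00%
# 9.449 6.299 -33.34%
# 5.512 1.575 -71.43%
```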
That's a clear, significant win for you, eh? I'm a little baffled by my
results. In real life day-to-day use, bigrams are doing great for me, under
mistake-and-unsure training + artificially forcing balance by "random
eyeball" selection. But CV testing shows a very small improvement (under
randomized TOE):
bayes\testtools>\python23\python cmp.py bases bis
bases -> bis
-> <stat> tested 342 hams & 94 spams against 3078 hams & 846 spams
<19 repetitions deleted>
false positive percentages
0.292 0.292 tied
0.000 0.000 tied
0.000 0.000 tied
0.292 0.292 tied
0.000 0.000 tied
0.000 0.000 tied
0.292 0.000 won -100.00%
0.000 0.000 tied
0.000 0.000 tied
0.000 0.000 tied
won 1 times
tied 9 times
lost 0 times
total unique fp went from 3 to 2 won -33.33%
mean fp % went from 0.0877192982457 to 0.0584795321638 won -33.33%
false negative percentages
2.128 2.128 tied
0.000 0.000 tied
0.000 0.000 tied
1.064 1.064 tied
2.128 2.128 tied
2.128 1.064 won -50.00%
2.128 2.128 tied
0.000 0.000 tied
0.000 0.000 tied
0.000 0.000 tied
won 1 times
tied 9 times
lost 0 times
total unique fn went from 9 to 8 won -11.11%
mean fn % went from 0.957446808511 to 0.851063829787 won -11.11%
ham mean ham sdev
0.51 0.48 -5.88% 5.96 5.96 +0.00%
0.12 0.20 +66.67% 1.08 1.76 +62.96%
0.44 0.49 +11.36% 4.55 4.49 -1.32%
0.39 0.43 +10.26% 5.59 5.79 +3.58%
0.49 0.57 +16.33% 4.58 5.28 +15.28%
0.84 0.75 -10.71% 6.12 5.54 -9.48%
0.47 0.31 -34.04% 5.60 3.59 -35.89%
0.34 0.52 +52.94% 3.15 4.77 +51.43%
0.20 0.21 +5.00% 2.08 2.26 +8.65%
0.08 0.04 -50.00% 0.88 0.52 -40.91%
ham mean and sdev for all runs
0.39 0.40 +2.56% 4.40 4.38 -0.45%
spam mean spam sdev
94.15 93.92 -0.24% 17.84 18.18 +1.91%
98.85 98.04 -0.82% 4.99 8.00 +60.32%
98.07 97.66 -0.42% 6.49 9.05 +39.45%
96.98 96.98 +0.00% 13.46 13.56 +0.74%
96.21 95.06 -1.20% 15.89 17.58 +10.64%
94.07 94.06 -0.01% 17.29 17.26 -0.17%
95.61 95.65 +0.04% 16.66 16.39 -1.62%
96.62 96.85 +0.24% 11.43 10.39 -9.10%
99.25 98.74 -0.51% 2.55 7.78 +205.10%
97.43 96.83 -0.62% 9.85 11.72 +18.98%
spam mean and sdev for all runs
96.72 96.38 -0.35% 12.88 13.66 +6.06%
ham/spam mean difference: 96.33 95.98 -0.35
filename:       base       bi
ham:spam:   3420:940 3420:940
fp total:          3        2
fp %:           0.09     0.06
fn total:          9        8
fn %:           0.96     0.85
unsure t:         80       84
unsure %:       1.83     1.93
real cost:    $55.00   $44.80
best cost:    $43.80   $39.40
h mean:         0.39     0.40
h sdev:         4.40     4.38
s mean:        96.72    96.38
s sdev:        12.88    13.66
mean diff:     96.33    95.98
k:              5.57     5.32
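
The "real cost" rows are consistent with a weighted sum of the three error counts. This is a minimal sketch, assuming weights of $10 per false positive, $1 per false negative, and $0.20 per unsure. The post does not state these weights, but they reproduce the printed totals for all three runs. The names below are illustrative only:

```python
# Assumed per-message costs in dollars; not stated in the post.
FP_COST, FN_COST, UNSURE_COST = 10.0, 1.0, 0.2


def real_cost(fp, fn, unsure):
    """Weighted sum of false positives, false negatives, and unsures."""
    return fp * FP_COST + fn * FN_COST + unsure * UNSURE_COST


# (fp total, fn total, unsure t) copied from the two summary tables.
runs = {"base": (3, 9, 80), "url": (3, 9, 79), "bi": (2, 8, 84)}

for name, counts in runs.items():
    print(f"{name}: ${real_cost(*counts):.2f}")
# prints:
# base: $55.00
# url: $54.80
# bi: $44.80
```

On this reading, most of the bigram run's $10.20 drop in real cost comes from its one fewer false positive, since a single fp is weighted as heavily as 50 unsures.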