[spambayes-dev] A URL experiment

Sun Jan 4 21:37:50 EST 2004

Here are my current results with Skip's latest patch; "url" is the same as
"base" except with the addition of

x-pick_apart_urls: True

bases -> urls
-> <stat> tested 342 hams & 94 spams against 3078 hams & 846 spams
<19 repetitions deleted>

false positive percentages
    0.292  0.292  tied
    0.000  0.000  tied
    0.000  0.000  tied
    0.292  0.292  tied
    0.000  0.000  tied
    0.000  0.000  tied
    0.292  0.292  tied
    0.000  0.000  tied
    0.000  0.000  tied
    0.000  0.000  tied

won   0 times
tied 10 times
lost  0 times

total unique fp went from 3 to 3 tied
mean fp % went from 0.0877192982457 to 0.0877192982457 tied

false negative percentages
    2.128  2.128  tied
    0.000  0.000  tied
    0.000  0.000  tied
    1.064  1.064  tied
    2.128  2.128  tied
    2.128  2.128  tied
    2.128  2.128  tied
    0.000  0.000  tied
    0.000  0.000  tied
    0.000  0.000  tied

won   0 times
tied 10 times
lost  0 times

total unique fn went from 9 to 9 tied
mean fn % went from 0.957446808511 to 0.957446808511 tied
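
For anyone checking the arithmetic, the two "mean ... %" lines can be reproduced from the per-run error counts. This is only a sketch. The counts are inferred from the percentages listed above (0.292% of 342 hams is 1 message, 2.128% of 94 spams is 2 messages), the helper name is made up for illustration, and averaging the per-run percentages is my reading of what the summary line reports.

```python
def mean_error_pct(errors_per_run, tested_per_run):
    """Average the per-run error percentages (hypothetical helper)."""
    pcts = [100.0 * e / tested_per_run for e in errors_per_run]
    return sum(pcts) / len(pcts)

# Per-run counts inferred from the listed percentages (an assumption).
fp_counts = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]   # out of 342 hams per run
fn_counts = [2, 0, 0, 1, 2, 2, 2, 0, 0, 0]   # out of 94 spams per run

mean_fp = mean_error_pct(fp_counts, 342)
mean_fn = mean_error_pct(fn_counts, 94)
print(mean_fp, mean_fn)
```

The counts sum to 3 false positives and 9 false negatives, matching the "total unique" lines.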

ham mean                     ham sdev
   0.51    0.51   +0.00%        5.96    5.96   +0.00%
   0.12    0.12   +0.00%        1.08    1.09   +0.93%
   0.44    0.44   +0.00%        4.55    4.55   +0.00%
   0.39    0.39   +0.00%        5.59    5.59   +0.00%
   0.49    0.49   +0.00%        4.58    4.60   +0.44%
   0.84    0.85   +1.19%        6.12    6.18   +0.98%
   0.47    0.47   +0.00%        5.60    5.60   +0.00%
   0.34    0.34   +0.00%        3.15    3.15   +0.00%
   0.20    0.20   +0.00%        2.08    2.08   +0.00%
   0.08    0.08   +0.00%        0.88    0.89   +1.14%

ham mean and sdev for all runs
   0.39    0.39   +0.00%        4.40    4.41   +0.23%

spam mean                    spam sdev
  94.15   94.16   +0.01%       17.84   17.83   -0.06%
  98.85   98.87   +0.02%        4.99    4.94   -1.00%
  98.07   98.34   +0.28%        6.49    5.99   -7.70%
  96.98   96.99   +0.01%       13.46   13.49   +0.22%
  96.21   96.25   +0.04%       15.89   15.83   -0.38%
  94.07   94.07   +0.00%       17.29   17.29   +0.00%
  95.61   95.65   +0.04%       16.66   16.65   -0.06%
  96.62   96.66   +0.04%       11.43   11.16   -2.36%
  99.25   99.27   +0.02%        2.55    2.55   +0.00%
  97.43   97.44   +0.01%        9.85    9.82   -0.30%

spam mean and sdev for all runs
  96.72   96.77   +0.05%       12.88   12.82   -0.47%

ham/spam mean difference: 96.33 96.38 +0.05

filename:    base      url
ham:spam:  3420:940
                   3420:940
fp total:        3       3
fp %:         0.09    0.09
fn total:        9       9
fn %:         0.96    0.96
unsure t:       80      79
unsure %:     1.83    1.81
real cost:  $55.00  $54.80
best cost:  $43.80  $43.00
h mean:       0.39    0.39
h sdev:       4.40    4.41
s mean:      96.72   96.77
s sdev:      12.88   12.82
mean diff:   96.33   96.38
k:            5.57    5.59
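
The "real cost" row can be recomputed from the fp, fn and unsure totals. The weights below ($10 per false positive, $1 per false negative, $0.20 per unsure) are an assumption on my part, not something stated in this post, but they do reproduce both figures in the table. The function name is hypothetical.

```python
def real_cost(fp, fn, unsure, fp_weight=10.0, fn_weight=1.0, unsure_weight=0.20):
    """Weighted cost of a run; the default weights are assumed, not quoted."""
    return fp * fp_weight + fn * fn_weight + unsure * unsure_weight

base_cost = real_cost(fp=3, fn=9, unsure=80)
url_cost = real_cost(fp=3, fn=9, unsure=79)
print("$%.2f  $%.2f" % (base_cost, url_cost))   # prints "$55.00  $54.80"
```

With fp and fn unchanged, the whole 20-cent difference comes from the single message that moved out of the unsure range.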

It's not hurting <wink>.  Skip, why don't you check this in, so we can try
to make testing easier for others?  I'm fine with making it the default
behavior, provided we get decent test results from more people.

[& Skip tests bigrams]
> ...
>     false negative percentages
>         7.874  6.299  won    -20.00%
>         6.299  4.724  won    -25.00%
>         9.449  6.299  won    -33.34%
>         9.449  5.512  won    -41.67%
>         10.236  4.724  won    -53.85%
>         5.512  1.575  won    -71.43%
>         7.087  5.512  won    -22.22%
>         5.556  5.556  tied
>         7.937  7.937  tied
>         8.661  2.362  won    -72.73%
>
>     won   8 times
>     tied  2 times
>     lost  0 times

That's a clear, significant win for you, eh?  I'm a little baffled by my
results.  In real life day-to-day use, bigrams are doing great for me, under
mistake-and-unsure training + artificially forcing balance by "random
eyeball" selection.  But CV testing shows a very small improvement (under
randomized TOE):

bayes\testtools>\python23\python cmp.py bases bis
bases -> bis
-> <stat> tested 342 hams & 94 spams against 3078 hams & 846 spams
<19 repetitions deleted>

false positive percentages
    0.292  0.292  tied
    0.000  0.000  tied
    0.000  0.000  tied
    0.292  0.292  tied
    0.000  0.000  tied
    0.000  0.000  tied
    0.292  0.000  won   -100.00%
    0.000  0.000  tied
    0.000  0.000  tied
    0.000  0.000  tied

won   1 times
tied  9 times
lost  0 times

total unique fp went from 3 to 2 won    -33.33%
mean fp % went from 0.0877192982457 to 0.0584795321638 won    -33.33%

false negative percentages
    2.128  2.128  tied
    0.000  0.000  tied
    0.000  0.000  tied
    1.064  1.064  tied
    2.128  2.128  tied
    2.128  1.064  won    -50.00%
    2.128  2.128  tied
    0.000  0.000  tied
    0.000  0.000  tied
    0.000  0.000  tied

won   1 times
tied  9 times
lost  0 times

total unique fn went from 9 to 8 won    -11.11%
mean fn % went from 0.957446808511 to 0.851063829787 won    -11.11%
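
The "won" percentages on the summary lines are plain relative changes from the first column to the second. A minimal sketch, with a made-up helper name:

```python
def pct_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before

fp_change = pct_change(3, 2)   # total unique fp: 3 -> 2
fn_change = pct_change(9, 8)   # total unique fn: 9 -> 8
print("%+.2f%%  %+.2f%%" % (fp_change, fn_change))   # prints "-33.33%  -11.11%"
```

A negative value counts as a win here because fewer errors is better.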

ham mean                     ham sdev
   0.51    0.48   -5.88%        5.96    5.96   +0.00%
   0.12    0.20  +66.67%        1.08    1.76  +62.96%
   0.44    0.49  +11.36%        4.55    4.49   -1.32%
   0.39    0.43  +10.26%        5.59    5.79   +3.58%
   0.49    0.57  +16.33%        4.58    5.28  +15.28%
   0.84    0.75  -10.71%        6.12    5.54   -9.48%
   0.47    0.31  -34.04%        5.60    3.59  -35.89%
   0.34    0.52  +52.94%        3.15    4.77  +51.43%
   0.20    0.21   +5.00%        2.08    2.26   +8.65%
   0.08    0.04  -50.00%        0.88    0.52  -40.91%

ham mean and sdev for all runs
   0.39    0.40   +2.56%        4.40    4.38   -0.45%

spam mean                    spam sdev
  94.15   93.92   -0.24%       17.84   18.18   +1.91%
  98.85   98.04   -0.82%        4.99    8.00  +60.32%
  98.07   97.66   -0.42%        6.49    9.05  +39.45%
  96.98   96.98   +0.00%       13.46   13.56   +0.74%
  96.21   95.06   -1.20%       15.89   17.58  +10.64%
  94.07   94.06   -0.01%       17.29   17.26   -0.17%
  95.61   95.65   +0.04%       16.66   16.39   -1.62%
  96.62   96.85   +0.24%       11.43   10.39   -9.10%
  99.25   98.74   -0.51%        2.55    7.78 +205.10%
  97.43   96.83   -0.62%        9.85   11.72  +18.98%

spam mean and sdev for all runs
  96.72   96.38   -0.35%       12.88   13.66   +6.06%

ham/spam mean difference: 96.33 95.98 -0.35

filename:     base      bi
ham:spam:  3420:940
                   3420:940
fp total:        3       2
fp %:         0.09    0.06
fn total:        9       8
fn %:         0.96    0.85
unsure t:       80      84
unsure %:     1.83    1.93
real cost:  $55.00  $44.80
best cost:  $43.80  $39.40
h mean:       0.39    0.40
h sdev:       4.40    4.38
s mean:      96.72   96.38
s sdev:      12.88   13.66
mean diff:   96.33   95.98
k:            5.57    5.32
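
The "k" row appears to be the ham/spam mean difference divided by the sum of the two standard deviations. That formula is my inference from the numbers rather than something stated here, but it reproduces the tabulated values to two decimals. A sketch with a hypothetical function name:

```python
def separation_k(ham_mean, ham_sdev, spam_mean, spam_sdev):
    """Mean difference over summed sdevs (formula inferred from the table)."""
    return (spam_mean - ham_mean) / (ham_sdev + spam_sdev)

k_base = separation_k(0.39, 4.40, 96.72, 12.88)
k_bi = separation_k(0.40, 4.38, 96.38, 13.66)
print("%.2f  %.2f" % (k_base, k_bi))   # prints "5.57  5.32"
```

On this measure bigrams separate the two score distributions slightly less well, even though the fp and fn totals both improved by one.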