[Spambayes] experimental_ham_spam_imbalance_adjustment result
Mark Hammond
mhammond at skippinet.com.au
Mon Mar 10 10:42:43 EST 2003
Here are my current results on the imbalance option. Interestingly, my
initial "-n 2" results looked better than my "-n 5" results below.
FWIW, Outlook users should remember there is an "export.py" script in the
addin directory. This will export your ham and spam into the
"spambayes\testdata\Data" directory, which is the default place the test
tools look. Just run it from the command line.
And for everyone else, once you have a "Data" directory, running the tests
means:
* Create testtools\bayescustomize.ini with the options you want to test
* Run "timtest.py -n 2 > result1.txt"
* Run "rates result1.txt" - this creates "result1s.txt"
* Repeat the above, changing the options, and redirecting to
"result2.txt" and getting "result2s.txt" as final output.
* Run "cmp.py result1s.txt result2s.txt"
Well - if it *doesn't* mean that, then you can ignore my results too <wink>.
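As a rough sketch of the steps above, the following Python helper only builds the command lines as strings; it does not run anything. The function names build_commands and build_comparison are made up for illustration, and the commands themselves are copied from the list above.

```python
def build_commands(run_name, n_sets=2):
    """Return the command lines for one test run, as plain strings.

    run_name is a hypothetical base name such as "result1".
    """
    raw = f"{run_name}.txt"
    return [
        f"timtest.py -n {n_sets} > {raw}",  # run the test driver, save its output
        f"rates {raw}",                     # produces <run_name>s.txt
    ]


def build_comparison(first, second):
    """Return the command line that compares two summarised runs."""
    return f"cmp.py {first}s.txt {second}s.txt"


if __name__ == "__main__":
    # Edit testtools\bayescustomize.ini between the two runs.
    for name in ("result1", "result2"):
        for command in build_commands(name):
            print(command)
    print(build_comparison("result1", "result2"))
```

Keeping the run name as the only parameter makes it harder to mix up which summary file belongs to which set of options.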
My results below are for "-n 5".
Mark.
\temp\imbalance_falses.txt -> \temp\imbalance_trues.txt
-> <stat> tested 412 hams & 1004 spams against 429 hams & 1019 spams
-> <stat> tested 440 hams & 1076 spams against 429 hams & 1019 spams
-> <stat> tested 397 hams & 1054 spams against 429 hams & 1019 spams
-> <stat> tested 477 hams & 1056 spams against 429 hams & 1019 spams
-> <stat> tested 429 hams & 1019 spams against 412 hams & 1004 spams
-> <stat> tested 440 hams & 1076 spams against 412 hams & 1004 spams
-> <stat> tested 397 hams & 1054 spams against 412 hams & 1004 spams
-> <stat> tested 477 hams & 1056 spams against 412 hams & 1004 spams
-> <stat> tested 429 hams & 1019 spams against 440 hams & 1076 spams
-> <stat> tested 412 hams & 1004 spams against 440 hams & 1076 spams
-> <stat> tested 397 hams & 1054 spams against 440 hams & 1076 spams
-> <stat> tested 477 hams & 1056 spams against 440 hams & 1076 spams
-> <stat> tested 429 hams & 1019 spams against 397 hams & 1054 spams
-> <stat> tested 412 hams & 1004 spams against 397 hams & 1054 spams
-> <stat> tested 440 hams & 1076 spams against 397 hams & 1054 spams
-> <stat> tested 477 hams & 1056 spams against 397 hams & 1054 spams
-> <stat> tested 429 hams & 1019 spams against 477 hams & 1056 spams
-> <stat> tested 412 hams & 1004 spams against 477 hams & 1056 spams
-> <stat> tested 440 hams & 1076 spams against 477 hams & 1056 spams
-> <stat> tested 397 hams & 1054 spams against 477 hams & 1056 spams
-> <stat> tested 412 hams & 1004 spams against 429 hams & 1019 spams
-> <stat> tested 440 hams & 1076 spams against 429 hams & 1019 spams
-> <stat> tested 397 hams & 1054 spams against 429 hams & 1019 spams
-> <stat> tested 477 hams & 1056 spams against 429 hams & 1019 spams
-> <stat> tested 429 hams & 1019 spams against 412 hams & 1004 spams
-> <stat> tested 440 hams & 1076 spams against 412 hams & 1004 spams
-> <stat> tested 397 hams & 1054 spams against 412 hams & 1004 spams
-> <stat> tested 477 hams & 1056 spams against 412 hams & 1004 spams
-> <stat> tested 429 hams & 1019 spams against 440 hams & 1076 spams
-> <stat> tested 412 hams & 1004 spams against 440 hams & 1076 spams
-> <stat> tested 397 hams & 1054 spams against 440 hams & 1076 spams
-> <stat> tested 477 hams & 1056 spams against 440 hams & 1076 spams
-> <stat> tested 429 hams & 1019 spams against 397 hams & 1054 spams
-> <stat> tested 412 hams & 1004 spams against 397 hams & 1054 spams
-> <stat> tested 440 hams & 1076 spams against 397 hams & 1054 spams
-> <stat> tested 477 hams & 1056 spams against 397 hams & 1054 spams
-> <stat> tested 429 hams & 1019 spams against 477 hams & 1056 spams
-> <stat> tested 412 hams & 1004 spams against 477 hams & 1056 spams
-> <stat> tested 440 hams & 1076 spams against 477 hams & 1056 spams
-> <stat> tested 397 hams & 1054 spams against 477 hams & 1056 spams
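Reading the <stat> lines above, each of the 5 sets seems to be paired with every other set, giving 5 * 4 = 20 runs, and the block of 20 lines appears twice, presumably once per result file. The sketch below reproduces that pairing from the set sizes shown in the output. It is an interpretation of the output, not the timtest.py code, and the name stat_lines is hypothetical.

```python
# (hams, spams) per set, in the order they first appear in the output above.
SETS = [(429, 1019), (412, 1004), (440, 1076), (397, 1054), (477, 1056)]


def stat_lines(sets):
    """Pair every set with each of the others and format one line per pair."""
    lines = []
    for i, (against_ham, against_spam) in enumerate(sets):
        for j, (tested_ham, tested_spam) in enumerate(sets):
            if i == j:
                continue  # a set is never tested against itself
            lines.append(
                f"-> <stat> tested {tested_ham} hams & {tested_spam} spams "
                f"against {against_ham} hams & {against_spam} spams"
            )
    return lines


if __name__ == "__main__":
    for line in stat_lines(SETS):
        print(line)
```

This also accounts for the 20 rows in each of the false positive and false negative tables.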
false positive percentages
1.699 1.214 won -28.55%
0.909 0.682 won -24.97%
1.008 0.756 won -25.00%
0.210 0.210 tied
0.932 0.699 won -25.00%
0.682 0.227 won -66.72%
1.008 0.504 won -50.00%
0.000 0.000 tied
0.466 0.233 won -50.00%
0.243 0.243 tied
1.259 0.504 won -59.97%
0.210 0.000 won -100.00%
0.699 0.466 won -33.33%
1.456 0.728 won -50.00%
1.818 1.591 won -12.49%
0.839 0.210 won -74.97%
0.466 0.233 won -50.00%
0.728 0.485 won -33.38%
0.455 0.227 won -50.11%
1.259 0.756 won -39.95%
won 17 times
tied 3 times
lost 0 times
total unique fp went from 40 to 26 won -35.00%
mean fp % went from 0.817290959648 to 0.49835280855 won -39.02%
false negative percentages
0.398 0.498 lost +25.13%
0.093 0.186 lost +100.00%
0.380 0.474 lost +24.74%
0.189 0.189 tied
0.294 0.294 tied
0.000 0.372 lost +(was 0)
0.190 0.285 lost +50.00%
0.379 0.568 lost +49.87%
0.491 0.883 lost +79.84%
0.896 1.195 lost +33.37%
0.664 1.139 lost +71.54%
0.189 0.379 lost +100.53%
0.294 0.393 lost +33.67%
0.498 0.697 lost +39.96%
0.093 0.093 tied
0.189 0.379 lost +100.53%
0.687 1.374 lost +100.00%
1.195 1.295 lost +8.37%
0.651 0.929 lost +42.70%
0.474 0.664 lost +40.08%
won 0 times
tied 3 times
lost 17 times
total unique fn went from 44 to 66 lost +50.00%
mean fn % went from 0.412283315133 to 0.614303438288 lost +49.00%
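The percentage columns above are consistent with a plain relative change, (after - before) / before. Here is a minimal sketch, checked only against the summary lines in this post; pct_change and verdict are hypothetical names, not functions from cmp.py. Lower is better for both error rates, so a decrease counts as "won".

```python
def pct_change(before, after):
    """Relative change in percent, or None when the starting value is 0."""
    if before == 0:
        return None  # the output above prints "+(was 0)" for this case
    return (after - before) / before * 100.0


def verdict(before, after):
    """Lower error rates are better, so a decrease is a win."""
    if after < before:
        return "won"
    if after > before:
        return "lost"
    return "tied"


if __name__ == "__main__":
    summaries = [
        ("total unique fp", 40, 26),
        ("mean fp %", 0.817290959648, 0.49835280855),
        ("total unique fn", 44, 66),
        ("mean fn %", 0.412283315133, 0.614303438288),
    ]
    for label, before, after in summaries:
        print(f"{label}: {verdict(before, after)} {pct_change(before, after):+.2f}%")
```

The per-run rows may differ slightly if recomputed from the three-decimal values shown, since they were presumably calculated from unrounded rates.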
ham mean                     ham sdev
   3.82    2.97  -22.25%       14.77   12.72  -13.88%
   3.31    2.42  -26.89%       13.21   10.90  -17.49%
   3.57    2.69  -24.65%       13.66   11.26  -17.57%
   4.26    3.20  -24.88%       15.98   13.14  -17.77%
   3.37    2.65  -21.36%       14.07   12.03  -14.50%
ham mean and sdev for all runs
   3.67    2.79  -23.98%       14.38   12.05  -16.20%
spam mean                    spam sdev
  98.10   96.94   -1.18%        8.44   10.32  +22.27%
  97.83   96.47   -1.39%        9.10   11.49  +26.26%
  97.63   96.24   -1.42%       10.29   12.59  +22.35%
  98.13   96.83   -1.32%        8.23   10.58  +28.55%
  96.93   95.49   -1.49%       11.94   14.13  +18.34%
spam mean and sdev for all runs
  97.72   96.40   -1.35%        9.71   11.91  +22.66%
ham/spam mean difference: 94.05 93.61 -0.44
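The "ham/spam mean difference" line matches the all-runs spam mean minus the all-runs ham mean for each option setting. A small sketch using the rounded figures printed above; the name separation is hypothetical.

```python
def separation(ham_mean, spam_mean):
    """Gap between the mean spam score and the mean ham score."""
    return spam_mean - ham_mean


if __name__ == "__main__":
    before = separation(3.67, 97.72)  # option off
    after = separation(2.79, 96.40)   # option on
    print(f"ham/spam mean difference: {before:.2f} {after:.2f} {after - before:+.2f}")
```

With the option on, both means move down, so the gap between them shrinks only slightly.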