[spambayes-dev] Another incremental training idea...
Skip Montanaro
skip at pobox.com
Thu Jan 15 08:50:22 EST 2004
Toby> If I'm reading this right, my 7:1 imbalance doesn't hurt me.
Toby> filename:       unbal       bal1       bal2       bal3
Toby> ham:spam:  14560:1992  1992:1992  1992:1992  1992:1992
Toby> fp total:           0          0          1          0
Toby> fp %:            0.00       0.00       0.05       0.00
Toby> fn total:          12          6          8          6
Toby> fn %:            0.60       0.30       0.40       0.30
Toby> unsure t:         102         21         23         29
Toby> unsure %:        0.62       0.53       0.58       0.73
Toby> real cost:     $32.40     $10.20     $22.60     $11.80
Toby> best cost:     $27.60      $7.00      $9.80      $8.60
Toby> h mean:          0.11       0.23       0.30       0.32
Toby> h sdev:          1.89       2.47       3.46       3.26
Toby> s mean:         96.93      99.06      99.04      99.02
Toby> s sdev:         12.11       6.88       6.98       7.21
Toby> mean diff:      96.82      98.83      98.74      98.70
Toby> k:               6.92      10.57       9.46       9.43
It doesn't seem to have a negative effect on false positives, but it looks
like you get roughly double the number of false negatives (12 vs. 6-8) and
about 3.5-5x as many unsures (102 vs. 21-29).
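
For anyone who wants to check that arithmetic, here is a minimal Python
sketch that recomputes the ratios from the counts in the quoted table. The
cost weights ($10 per fp, $1 per fn, $0.20 per unsure) are an assumption,
not something stated in this thread; they are used only because they
reproduce the "real cost" row. The variable and function names are made up
for illustration.

```python
# Counts copied from the quoted table; column order: unbal, bal1, bal2, bal3.
runs = ["unbal", "bal1", "bal2", "bal3"]
fp = [0, 0, 1, 0]
fn = [12, 6, 8, 6]
unsure = [102, 21, 23, 29]


def ratios_vs_unbalanced(counts):
    """Divide the unbalanced count (first column) by each balanced count."""
    base = counts[0]
    return [base / c for c in counts[1:]]


fn_ratios = ratios_vs_unbalanced(fn)
unsure_ratios = ratios_vs_unbalanced(unsure)

# Assumed per-message cost weights (not given in the thread).
FP_COST, FN_COST, UNSURE_COST = 10.0, 1.0, 0.2
costs = [
    round(FP_COST * a + FN_COST * b + UNSURE_COST * c, 2)
    for a, b, c in zip(fp, fn, unsure)
]

for name, r_fn, r_un in zip(runs[1:], fn_ratios, unsure_ratios):
    print(f"unbal vs {name}: fn x{r_fn:.2f}, unsure x{r_un:.2f}")
for name, cost in zip(runs, costs):
    print(f"{name}: ${cost:.2f}")
```

Note that these are ratios of raw counts; the "fn %" and "unsure %" rows
in the table are the ones normalized by the number of messages tested.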
Skip