[Spambayes] Proposing to make chi-combining the default

Tim Peters tim.one@comcast.net
Sun Oct 27 02:57:06 2002


[Tim]
>>2'. Change the default ham_cutoff to 0.20 and the default spam_cutoff
>>    to 0.90.

[T. Alexander Popiel]
> I'm slightly surprised at the looseness of 2', but as you say,
> the boundaries aren't all that touchy.

For my own email, and on my large c.l.py test, I use cutoffs of 0.30 and
0.80 with chi-combining very happily, so the suggested defaults are
conservative relative to that.  But they're *just* defaults, and anyone
taking a default too seriously should be shot <wink>.  Certainly, they
should be closer to the endpoints if you're just starting training.
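
For anyone unsure what the two cutoffs do, here's a minimal sketch of
the three-way split they imply.  It isn't the project's code:  the
function name `classify` is made up for illustration, and the handling
of scores that land exactly on a cutoff is my assumption.

```python
# Hypothetical helper, not taken from the spambayes source.
HAM_CUTOFF = 0.20   # proposed default ham_cutoff
SPAM_CUTOFF = 0.90  # proposed default spam_cutoff

def classify(score, ham_cutoff=HAM_CUTOFF, spam_cutoff=SPAM_CUTOFF):
    """Map a combined score in [0.0, 1.0] to 'ham', 'unsure' or 'spam'."""
    if score < ham_cutoff:
        return "ham"
    if score >= spam_cutoff:  # assumption: the boundary counts as spam
        return "spam"
    return "unsure"           # the middle ground between the cutoffs
```

With the looser 0.30 and 0.80 pair, `classify(0.25, 0.30, 0.80)` comes
back "ham", where the proposed defaults would call it "unsure".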

> I'm all for the above.

Nobody has objected, so I'll make the change next (I already made the other
changes threatened, BTW -- use_mixed_combining is gone, and ditto
ignore_redundant_html).

Anyone wedded to gary-combining, don't panic:  your database is unaffected
by changing this default.  There's no need to retrain it.  If you want to
continue using gary-combining for scoring (again, the remaining combining
schemes have nothing to do with training; they're purely a scoring-time
choice), you'll need to add

[Classifier]
use_gary_combining: True

to your .ini file.
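
If you want to check that the override parses the way you expect, here's
a rough sketch using Python's standard configparser module.  This is
only an illustration, not how the project itself reads its options, and
the helper name `wants_gary_combining` is made up.

```python
import configparser

# The same two lines shown above, held in a string for the example.
INI_TEXT = """\
[Classifier]
use_gary_combining: True
"""

def wants_gary_combining(ini_text):
    """Return True if the ini text turns on use_gary_combining."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    # A missing section or option falls back to False, i.e. the
    # override has not been requested.
    return parser.getboolean("Classifier", "use_gary_combining",
                             fallback=False)
```

An empty string passed to `wants_gary_combining` gives False, which
matches leaving the option out of your .ini file.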