[Spambayes] Mixed case words in heading

Skip Montanaro skip at pobox.com
Sun Apr 13 18:10:18 EDT 2003


    Jan> My original question was whether mixed case should be penalized:

It's easy enough to tweak the spambayes tokenizer to generate a synthetic
token for unusually capitalized words.  You don't assign a penalty to it
yourself; you let the classifier decide whether it is a hammy or spammy
(or neither) clue.
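
As a minimal sketch of the idea (plain Python, not the actual spambayes
tokenizer code), something like the following would emit one extra token
per oddly capitalized word.  The token name, the function names, and the
definition of "unusual" (neither all lowercase, all uppercase, nor
Capitalized) are assumptions made for illustration only.

```python
import string

# Hypothetical token name; not taken from the spambayes source.
MIXED_CASE_TOKEN = "synthetic:mixed-case"


def is_unusually_capitalized(word):
    """Return True for words such as 'FrEe' or 'vIaGrA'.

    Assumed definition: a purely alphabetic word of two or more letters
    that is neither all lowercase, all uppercase, nor Capitalized.
    """
    if len(word) < 2 or not word.isalpha():
        return False
    return not (word.islower() or word.isupper() or word.istitle())


def tokenize_with_case_clue(text):
    """Yield lowercased words, plus a synthetic token after each
    unusually capitalized word.  No score is attached here; the
    classifier is left to learn what the token means."""
    for raw in text.split():
        word = raw.strip(string.punctuation)
        if not word:
            continue
        yield word.lower()
        if is_unusually_capitalized(word):
            yield MIXED_CASE_TOKEN
```

For example, list(tokenize_with_case_clue("Buy FrEe pills NOW")) gives
the four lowercased words with the synthetic token right after "free".
Note that this definition also flags names like "McDonald", which is
fine if the classifier is the one deciding what the clue is worth.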

A weird idea just crossed my mind.  Has anyone ever tested the performance
of the system using only synthetic tokens, no real content?
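
One way such a test could be set up is to filter each message's token
stream before it reaches the classifier.  The sketch below assumes,
purely for illustration, that every synthetic token carries a known
prefix; the prefix and the function name are hypothetical and would
have to be matched to whatever the tokenizer really emits.

```python
# Hypothetical marker for synthetic tokens; an assumption, not a
# spambayes convention.
SYNTHETIC_PREFIX = "synthetic:"


def synthetic_only(tokens, prefix=SYNTHETIC_PREFIX):
    """Yield only the tokens that start with the synthetic prefix,
    dropping every token that came from real message content."""
    for tok in tokens:
        if tok.startswith(prefix):
            yield tok
```

Training and scoring on synthetic_only(tokens) instead of tokens would
then show how much the synthetic clues carry on their own.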

Skip


