[Spambayes] How does Spambayes deal with those "random" words in spam?

Parzival parz at shaw.ca
Fri Nov 7 12:38:21 EST 2003


Lots of spam contains "random" words in the subject and, visibly or hidden, in 
the message body. Since each spam contains different random strings, these 
words are unlikely to reappear in subsequent spam. Does this reduce 
the effectiveness of the classification?

A human seeing such a message with garbage words would immediately recognize it 
as spam. Could the classifier be extended to assign higher spam ratings to 
messages containing a large number of "words which have never been seen"?
Possibly a user could pre-seed the classifier with a dictionary of words in 
his/her language and/or jargon.
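
To make the idea concrete, here is a rough sketch in Python of the kind of 
check I have in mind. It is not Spambayes code: the function name 
unseen_fraction, the tokenizing regex, and the sample word set are all made 
up for illustration, and a real tokenizer would be far more careful.

```python
import re

def unseen_fraction(text, known_words):
    """Return the share of distinct tokens in `text` not found in `known_words`.

    `known_words` stands in for whatever vocabulary the classifier has
    already seen (or a pre-seeded dictionary); it is a hypothetical input.
    """
    # Crude tokenizer: lowercase runs of letters and apostrophes.
    tokens = set(re.findall(r"[a-z']+", text.lower()))
    if not tokens:
        return 0.0
    unseen = [t for t in tokens if t not in known_words]
    return len(unseen) / len(tokens)

# Hypothetical vocabulary and messages, for illustration only.
known = {"meeting", "tomorrow", "at", "noon", "cheap", "pills", "buy"}
ham = "Meeting tomorrow at noon"
spam = "Buy cheap pills xqzvt wplkj qqrrz"

print(unseen_fraction(ham, known))   # every token is known
print(unseen_fraction(spam, known))  # half of the tokens are unknown
```

A fraction like this might be treated as one more clue for the classifier 
rather than as a verdict on its own. Whether it helps, and at what 
threshold, would have to be settled by testing on real mail.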


-- Parzival


