[Spambayes] How does Spambayes deal with those "random" words in
spam?
Parzival
parz at shaw.ca
Fri Nov 7 12:38:21 EST 2003
Lots of spam contains "random" words in the subject line and, visibly or
hidden, in the message body. Since each spam contains different random strings,
these words are very unlikely to reappear in subsequent spam. Does this reduce
the effectiveness of the classification?
A human seeing such a message with garbage words would immediately recognize it
as spam. Could the classifier be extended to assign higher spam ratings to
messages containing a large number of "words which have never been seen"?
Possibly a user could pre-seed the classifier with a dictionary of words in
his/her language and/or jargon.
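To make the idea concrete, here is a rough Python sketch of what I have in
mind. It is not Spambayes code: the names tokenize, unseen_fraction and
seed_dictionary are made up for illustration, and the tokenizer is a deliberately
naive stand-in. It only measures what fraction of a message's distinct words
have never been seen, either in training mail or in a pre-seeded dictionary.

```python
import re

# Hypothetical helpers for illustration only; not part of Spambayes.
TOKEN_RE = re.compile(r"[a-z0-9']+")


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return TOKEN_RE.findall(text.lower())


def unseen_fraction(text, known_words):
    """Return the fraction of distinct tokens in text that are not in known_words."""
    tokens = set(tokenize(text))
    if not tokens:
        return 0.0  # an empty message has no unseen words
    unseen = tokens - known_words
    return len(unseen) / len(tokens)


# Words seen in previously trained mail (toy data).
trained = set(tokenize("meeting agenda for the project review on friday"))
# Assumed user-supplied dictionary of ordinary words and jargon.
seed_dictionary = {"please", "send", "report", "the"}
known = trained | seed_dictionary

ham = "Please send the project report on Friday"
spam = "xqzvb meeting klwpt jrrdn the fhqua"

ham_score = unseen_fraction(ham, known)    # every word is known
spam_score = unseen_fraction(spam, known)  # 4 of 6 distinct words are unseen
```

Any cutoff on this fraction would have to be found by testing, and I realize
that legitimate mail with unusual names, typos or foreign words would also
score high, so it could only ever be one clue among many.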
-- Parzival