[Spambayes] Spam Detection Adds Custom Header Entry
Dennis W. Bulgrien
dbulgrien at vcsd.com
Thu Jan 8 09:26:13 EST 2004
My e-mail server runs a spam filter on incoming mail. When the filter finds
spam, it adds an X-Spam-Score entry to the header and attaches a report
explaining why it considered the message to be spam. Spambayes (the Outlook
plug-in) apparently ignores this [unknown] header entry, or maybe it is never
given the opportunity to parse it. I also don't see any Spambayes clues based
on the presence of attachments, their names, or their contents. In the present
case such clues would be helpful, as all of the attachment names are identical.
Let it be known, however, that even without these clues Spambayes is still
doing a very good job of catching this spam.
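To make the attachment-name idea concrete, here is a minimal sketch using only
Python's standard email package. It is not how Spambayes tokenizes messages;
the function name attachment_name_tokens and the "attachment-name:" token
prefix are made up for illustration.

```python
from email import message_from_string


def attachment_name_tokens(raw_message):
    """Return one hypothetical clue token per named MIME part.

    The "attachment-name:" prefix is an assumption for this sketch,
    not an existing Spambayes token.
    """
    msg = message_from_string(raw_message)
    tokens = []
    # walk() visits the message itself and every nested MIME part.
    for part in msg.walk():
        # get_filename() reads Content-Disposition's filename parameter
        # and falls back to Content-Type's name parameter.
        name = part.get_filename()
        if name:
            tokens.append("attachment-name:" + name.lower())
    return tokens
```

Because the server gives every report the same file name, a token like this
would show up in every message the server flagged.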
________________________________________
Example of header:
X-Mailer: Jollymail v2.5
...
X-Spam-Score: 8.633 (********)
 BIZ_TLD,FROM_ENDS_IN_NUMS,HTML_30_40,HTML_FONT_INVISIBLE,HTML_MESSAGE,
 MIME_BOUND_NEXTPART,MIME_HTML_NO_CHARSET,MIME_HTML_ONLY,MIME_HTML_ONLY_MULTI,
 MIME_MISSING_BOUNDARY
X-Scanned-By: MIMEDefang 2.39
...
Content-Type: text/plain; name="SpamAssassinReport.txt"
Content-Disposition: inline; filename="SpamAssassinReport.txt"
...
X-Mailer: MIME-tools 5.411 (Entity 5.404)
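
For anyone who wants to experiment with the header, the sketch below splits an
X-Spam-Score value into its numeric score and its list of rule names. The
layout it expects (a number, an optional run of asterisks in parentheses, then
comma-separated rule names) is inferred only from this one example, and
parse_spam_score is a hypothetical helper, not part of Spambayes.

```python
import re


def parse_spam_score(value):
    """Split an X-Spam-Score value into (score, rule_names).

    Returns None if the value does not start with a number.
    The expected layout is an assumption based on a single sample.
    """
    m = re.match(r"\s*(-?\d+(?:\.\d+)?)\s*(?:\(\**\))?\s*(.*)", value, re.DOTALL)
    if not m:
        return None
    # The rule list may be wrapped across lines, so drop all
    # whitespace before splitting on commas.
    rules_text = re.sub(r"\s+", "", m.group(2))
    rules = [name for name in rules_text.split(",") if name]
    return float(m.group(1)), rules
```

Each rule name could then be offered to the classifier as its own clue, and
the score could be bucketed (for example, rounded down to an integer) so that
similar scores share a token.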
________________________________________
Here's an example of the attachment, SpamAssassinReport.txt:
Spam detection software, running on the system... has
identified this incoming email as possible spam...
Content preview: URI:http://xn--zj4b74h0vcg1k.com/index.html
...
Content analysis details: (31.4 points, 4.0 required)
 pts rule name              description
---- ---------------------- --------------------------------------------------
 0.3 NO_REAL_NAME           From: does not include a real name
 1.0 FROM_ENDS_IN_NUMS      From: ends in numbers
 3.7 KOREAN_UCE_SUBJECT     Subject: contains Korean unsolicited email tag
 0.4 HTML_60_70             BODY: Message is 60% to 70% HTML
 3.2 CHARSET_FARAWAY        BODY: Character set indicates a foreign language
 0.7 HTML_TAG_BALANCE_TABLE BODY: HTML is missing "table" close tags
 0.1 HTML_FONTCOLOR_UNKNOWN BODY: HTML font color is unknown to us
 0.7 MIME_HTML_ONLY         BODY: Message only has text/html MIME parts
 0.2 HTML_MESSAGE           BODY: HTML included in message
 0.3 HTML_FONT_BIG          BODY: HTML has a big font
 1.9 HTML_IMAGE_ONLY_04     BODY: HTML: images with 200-400 bytes of words
 3.6 SUBJ_ILLEGAL_CHARS     Subject contains too many raw illegal characters
 4.3 FROM_ILLEGAL_CHARS     From contains too many raw illegal characters
 1.8 DATE_IN_PAST_96_XX     Date: is 96 hours or more before Received: date
 3.7 MSGID_FROM_MTA_SHORT   Message-Id was added by a relay
 2.2 FROM_ALL_NUMS          From an address that is all numbers (non-phone)
 1.1 HTML_MIME_NO_HTML_TAG  HTML-only message, but there is no HTML tag
 2.5 MIME_CHARSET_FARAWAY   MIME character set indicates foreign language
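
The rows of a report like this are regular enough to pull apart with one
regular expression. The following is only a sketch under the assumption that
every rule row is "points, rule name in capitals, description" separated by
whitespace, as in the sample; parse_report and ROW are names invented for
this example.

```python
import re

# One rule row: points, an upper-case rule name, then free-text description.
ROW = re.compile(r"^\s*(-?\d+(?:\.\d+)?)\s+([A-Z][A-Z0-9_]*)\s+(.*\S)\s*$")


def parse_report(text):
    """Return a list of (points, rule_name, description) tuples.

    Lines that do not look like rule rows (the column header, the
    dashed separator, the preamble) are skipped.
    """
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows.append((float(m.group(1)), m.group(2), m.group(3)))
    return rows
```

The rule names recovered this way could serve as attachment-content clues,
although they would only repeat what the server's filter has already decided.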