[Spambayes] spam designed to defeat Bayesian filters

Wed Nov 19 14:54:28 EST 2003

[Seth Goodman]
> Attached is an email (along with resulting spam clues) that
> apparently was designed specifically to get past Bayesian filters.

The purpose of random text is to defeat fingerprinting schemes.  The best
random crap of this kind can hope to do against our kind of filter is push
the score down to unsure, although for some people, some of the time, it
will score as ham.  As Skip said recently, unless spammers can access your
training database to find out what is and isn't hammy and spammy to *you*,
random text is just pissing in the wind.

> I believe this was mentioned before as the "white on white" HTML
> problem.  The email has a large number of legitimate words, probably
> randomly picked from a dictionary, in a section where the font color
> is almost white on a white background.  There is a little snippet of
> HTML at the end that contains my email address.  I don't know what it
> does but I don't like the looks of it.

It's probably "a beacon".  If your email reader (like Outlook <wink>)
normally renders HTML, and is willing to fetch images over the web, the
sender can arrange for it to send *back* any kind of info the sender wants
in the URL, like the email address the spam was sent to.  When the spammer's
site gets the URL request, as a side effect it gets the email address
embedded in the URL too.  Then the spammer knows that address is "live", and
can sell it to other spammers.
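
To make the beacon idea concrete, here is a minimal sketch (not anything
SpamBayes does) that lists the image URLs in an HTML body that carry the
recipient's own address.  The names find_beacons and ImageSourceCollector,
and the example.invalid URLs, are made up for illustration, and it only
catches an address embedded literally or percent-encoded, not a hashed or
otherwise disguised ID.

```python
from html.parser import HTMLParser
from urllib.parse import unquote


class ImageSourceCollector(HTMLParser):
    """Collect the src attribute of every <img> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def find_beacons(html, recipient):
    """Return image URLs that contain the recipient's address.

    Fetching any of these would hand the address back to the sender.
    """
    collector = ImageSourceCollector()
    collector.feed(html)
    collector.close()
    needle = recipient.lower()
    # Decode %40 and friends before looking for the address.
    return [src for src in collector.sources
            if needle in unquote(src).lower()]


if __name__ == "__main__":
    body = ('<img src="http://example.invalid/logo.gif">'
            '<img src="http://example.invalid/t.gif?id=user%40example.com">')
    for url in find_beacons(body, "user@example.com"):
        print("possible beacon:", url)
```

A mail reader that never fetches remote images sidesteps the problem
entirely; this only shows what the embedded address looks like.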

> The message appears blank, unless you look very closely and then
> look at the HTML source.  Not only does this message slip through the
> classifier as ham,

It slipped through yours today, but it won't slip through everyone's today,
and it won't even slip through yours on some other day -- collections of
random words get pretty much random scores, but tending toward 50 on average.
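
The "random scores, tending toward 50" claim can be illustrated with a small
simulation.  This is a sketch under loud assumptions: it uses a chi-squared
combining rule of the general kind used by Bayesian-style filters, scores
run 0 to 1 rather than 0 to 100, and the per-word spam probabilities are
drawn uniformly at random, which a real training database would not give
you.  The function names are made up for illustration.

```python
import math
import random


def chi2q(x2, dof):
    """Probability that a chi-squared variable with even dof exceeds x2."""
    assert dof % 2 == 0
    m = x2 / 2.0
    term = total = math.exp(-m)
    for i in range(1, dof // 2):
        term *= m / i
        total += term
    return min(total, 1.0)


def combine(probs):
    """Combine per-word spam probabilities into one score in [0, 1]."""
    n = len(probs)
    spam_evidence = 1.0 - chi2q(
        -2.0 * sum(math.log(1.0 - p) for p in probs), 2 * n)
    ham_evidence = 1.0 - chi2q(
        -2.0 * sum(math.log(p) for p in probs), 2 * n)
    return (spam_evidence - ham_evidence + 1.0) / 2.0


def mean_random_score(trials=2000, words=50, seed=0):
    """Average score of messages whose word probabilities are random.

    Assumption: probabilities uniform on [0.01, 0.99], purely illustrative.
    """
    rng = random.Random(seed)
    scores = [combine([rng.uniform(0.01, 0.99) for _ in range(words)])
              for _ in range(trials)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    print("mean score over random messages:", mean_random_score())
```

Individual messages land all over the range; it is only the average that
sits near the middle, which is why the same trick can score as ham for one
person on one day and as unsure or spam for someone else.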

> but training on this message as spam would probably reduce the
> effectiveness of the classifier.

I've trained on such things without apparent harm.  Very few non-hapax words
are purely hammy or purely spammy, and this kind of scheme is robust against
that kind of ambiguity (at least when well-trained; a mistake-driven
training approach leaves it more brittle, but that's what I do these days,
and I still haven't noticed harm from training on such things).
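
A rough way to see why one such training does little damage: a sketch of a
Robinson-style adjusted word probability, where a word's raw spam ratio is
pulled toward a neutral prior in proportion to how little evidence there is.
The function name, the prior strength s=0.45, the prior x=0.5, and the
counts below are assumptions picked for illustration, not values taken from
this message.

```python
def word_spamprob(spam_count, ham_count, nspam, nham, s=0.45, x=0.5):
    """Adjusted spam probability for one word (illustrative sketch).

    spam_count, ham_count: messages of each kind containing the word.
    nspam, nham: total trained messages of each kind (both must be > 0).
    s, x: assumed strength and value of the prior for unseen words.
    """
    n = spam_count + ham_count
    if n == 0:
        return x  # never seen: fall back to the neutral prior
    spam_ratio = spam_count / nspam
    ham_ratio = ham_count / nham
    p = spam_ratio / (spam_ratio + ham_ratio)
    return (s * x + n * p) / (s + n)


if __name__ == "__main__":
    # A word seen in 40 ham and no spam, before and after training on one
    # random-word spam that happens to contain it.
    before = word_spamprob(0, 40, nspam=1000, nham=1000)
    after = word_spamprob(1, 40, nspam=1001, nham=1000)
    # A hapax: seen exactly once, in that same spam.
    hapax = word_spamprob(1, 0, nspam=1001, nham=1000)
    print(before, after, hapax)
```

Under these assumptions the well-attested hammy word stays strongly hammy
after the extra spam, while the hapax looks spammy but rests on a single
sighting.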

> My questions are:
>
> 1) What is this thing?  Does it harvest addresses when rendered?

That appears to be its intent, yes.

> 2) Are there any approaches that have been discussed to ignore the
> "almost white" text during parsing?

Sure, but it requires semantic analysis deeper than we do, and would have to
start with doing real parsing of HTML.  That would be much more expensive
than what we do now.
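
For a feel of what "real parsing" would involve, here is a deliberately
naive sketch using Python's html.parser.  It assumes a white background,
only looks at <font color="#rrggbb"> tags, and ignores CSS, named colors,
tables, and every other way of hiding text, so it is nowhere near a
solution.  VisibleTextExtractor, is_near_white, and the 0xE0 threshold are
all made up for illustration.

```python
from html.parser import HTMLParser


def is_near_white(color, threshold=0xE0):
    """True if a '#rrggbb' color has every channel at or above threshold."""
    color = color.strip().lstrip("#")
    if len(color) != 6:
        return False
    try:
        channels = [int(color[i:i + 2], 16) for i in (0, 2, 4)]
    except ValueError:
        return False
    return all(c >= threshold for c in channels)


class VisibleTextExtractor(HTMLParser):
    """Keep text unless the innermost <font> color is close to white."""

    def __init__(self):
        super().__init__()
        self.hidden_stack = []  # one bool per currently open <font> tag
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "font":
            color = dict(attrs).get("color")
            if color is None:
                # No color given: inherit from the enclosing <font>, if any.
                inherited = bool(self.hidden_stack) and self.hidden_stack[-1]
                self.hidden_stack.append(inherited)
            else:
                self.hidden_stack.append(is_near_white(color))

    def handle_endtag(self, tag):
        if tag == "font" and self.hidden_stack:
            self.hidden_stack.pop()

    def handle_data(self, data):
        if not (self.hidden_stack and self.hidden_stack[-1]):
            self.chunks.append(data)


def visible_text(html):
    """Return the text a reader would plausibly see, whitespace-normalized."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join("".join(parser.chunks).split())


if __name__ == "__main__":
    sample = ('<body bgcolor="#ffffff">'
              '<font color="#fefefe">apple banana cherry</font>'
              '<font color="#000000">Buy now</font></body>')
    print(visible_text(sample))
```

Even this toy has to track nesting and inherited state, which hints at the
cost: doing it properly means modeling how a renderer resolves colors, and
spammers can switch to a different hiding trick as soon as one is handled.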

> 3) Is anything in the works for this exploit?

Not that I know of.