[Spambayes] here's a really tough one, but at least it scored unsure!

Seth Goodman sethg at GoodmanAssociates.com
Fri Nov 25 04:48:47 CET 2005


> From: spambayes-bounces at python.org
> [mailto:spambayes-bounces at python.org]On Behalf Of Tony Meyer
> Sent: Thursday, November 24, 2005 2:39 PM
>
>
> > V P A V X C
> > I r m A a I
> > A o b L n A
> > G z i I a L
> > R a e U x I
> > A c n M   S
> > $69,95     $85,45   $99,95
> > http://armelaurofishruner.tripod.com
> [...]
> > I've got to hand it to these folks for deviousness.  Looks like
> > it came through a trojaned dynamic IP machine in Mexico.
> > And the results...
>
> Do you really think that these messages actually sell anything?

I certainly hope not!  It's been said before that very few people have
gone broke underestimating the intelligence of the American public.


> Don't people receiving it see a bunch of junk?  I've also seen this
> one using CSS rather than tables, BTW.

It looks like junk to me, but it made me laugh because it's obvious why
they did it.  The vertical column approach is clever in that they can
keep reordering the columns, so the rows never contain consistent letter
combinations for the filter to train on.  I was satisfied that it
scored unsure and intentionally did not add it to my training set.
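
To make the column trick concrete, here is a minimal Python sketch.  It
is only an illustration and not anything Spambayes does: the function
names to_columns and read_columns are hypothetical, and it assumes one
letter per cell with a single space between cells, as in the quoted
message.  It shows that reordering the columns changes every row a
row-based tokenizer would see, while a reader scanning down the columns
still gets the same words.

```python
def to_columns(words):
    """Write each word down its own column; return the resulting rows."""
    height = max(len(w) for w in words)
    padded = [w.ljust(height) for w in words]
    # Row i holds the i-th letter of every word, separated by spaces.
    return [" ".join(w[i] for w in padded) for i in range(height)]


def read_columns(rows):
    """Recover the words by reading down the columns (the 'eye-space' view)."""
    width = max(len(r) for r in rows)
    # Letters sit at even offsets; pad rows whose trailing spaces were stripped.
    cells = [r.ljust(width)[::2] for r in rows]
    return ["".join(col).strip() for col in zip(*cells)]


words = ["VIAGRA", "Prozac", "Ambien", "VALIUM", "Xanax", "CIALIS"]
rows_a = to_columns(words)        # same layout as the quoted spam
rows_b = to_columns(words[::-1])  # same words, columns reordered

# The two layouts share no row text, yet both read back to the same words.
print(set(rows_a).isdisjoint(rows_b))
print(read_columns(rows_a) == read_columns(rows_b)[::-1])
```

Undoing the layout is easy here only because the sketch assumes plain
text with a fixed grid; with tables or CSS positioning, finding the
columns is exactly the rendering problem discussed below.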


>
> To counter these sorts of tricks requires moving into 'eye-space',
> which means building an HTML/CSS renderer, which would be a
> big task, and I'm not convinced it would be that worth it.

Strongly agree.  This is outside the solution space of Spambayes.  This
spam is a crude step on the way to sending an image that hides the text
completely.  Of course, even then we still get the HTML tags and a URL,
so I'm not worried.



> I think messages like this are better countered with other
> techniques (that trojaned machine could well have been on a
> blacklist somewhere, for example, or we can check out the
> content of the URL, as the experimental slurping options do).

I tend to agree.  The trojaned machine's IP was not yet blacklisted
when it hit my MX, but it was soon after, so other people could reject
it.  The included URLs don't always last that long.  For the ones that
do, there are blacklists for spamvertised URLs, such as SURBL
(http://www.surbl.org/).  If the URL is not forcibly taken down, then
Spambayes can learn it _and_ SURBL will have it.  I haven't felt the
need to use SURBL because Spambayes works so well as it is, but it's
nice to know that it's there should it be necessary.
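
For anyone curious what a SURBL check involves, here is a rough Python
sketch.  It is an assumption-laden illustration, not part of Spambayes:
the helper names surbl_query_name and is_listed are hypothetical, the
zone multi.surbl.org is SURBL's combined list as I understand it, and
taking the last two labels as the registered domain is a deliberate
simplification that is wrong for suffixes like co.uk.  SURBL works like
a DNS blacklist: you look up the domain under the list's zone, and an
answer in 127.0.0.0/8 means the domain is listed.

```python
import socket
from urllib.parse import urlsplit

# Assumption: SURBL's combined zone; check their usage policy before relying on it.
SURBL_ZONE = "multi.surbl.org"


def surbl_query_name(url):
    """Build the DNS name to look up for the domain in a URL."""
    host = urlsplit(url).hostname or ""
    labels = host.lower().split(".")
    # Simplification: treat the last two labels as the registered domain.
    return ".".join(labels[-2:] + [SURBL_ZONE])


def is_listed(url):
    """Return True if the lookup answers with a 127.x.x.x address."""
    try:
        answer = socket.gethostbyname(surbl_query_name(url))
    except socket.gaierror:
        # No such name: the domain is not on the list (or DNS failed).
        return False
    return answer.startswith("127.")


query = surbl_query_name("http://armelaurofishruner.tripod.com")
print(query)
```

Note that for the URL in this spam the query collapses to the hosting
provider's domain, which a blacklist is unlikely to carry, so free
hosting sites are a weak spot for this kind of lookup.  Queries sent
through large shared resolvers may also be refused, so treat a negative
result with some caution.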

--

Seth Goodman


