[spambayes-dev] 1070 spam, 1 false positive

Tim Peters tim.one at comcast.net
Fri Jun 20 00:40:36 EDT 2003


[Greg Ward]
> Gee, I don't think I told the spambayes-dev crowd that I switched
> mail.python.org over to Spambayes last weekend.  Anyways, so far it
> has rejected 1070 spam messages, and we just got our first false
> positive today:
>
> """
> Return-Path: <*CENSORED*@aol.com>

spambayes considers all email from AOL to be spam, you know <wink>.

> Envelope-To: Tutor at python.org
> Received: from imo-d04.mx.aol.com ([205.188.157.36])
>         by mail.python.org with esmtp (Exim 4.05)
>         id 19Sp8l-000636-00
>         for Tutor at python.org; Wed, 18 Jun 2003 22:27:07 -0400
> Received: from *CENSORED*@aol.com
>         by imo-d04.mx.aol.com (mail_out_v36.3.) id 8.bd.3377bddd
>          (4402) for <Tutor at python.org>; Wed, 18 Jun 2003 22:27:01
>          -0400 (EDT)
> From: *CENSORED*@aol.com
> Message-ID: <bd.3377bddd.2c227975 at aol.com>
> Date: Wed, 18 Jun 2003 22:27:01 EDT
> Subject: tutor
> To: Tutor at python.org
> MIME-Version: 1.0
> Content-Type: multipart/alternative;
>               boundary="part1_bd.3377bddd.2c227975_boundary"
> X-Mailer: 8.0 for Windows sub 6011
> X-Spam-Status: SPAM (default 0.994)
>
> Content-Type: text/plain; charset="US-ASCII"
> Content-Transfer-Encoding: 7bit
>
> UNSUBSCRIBE PLEASE
>
>
> *CENSORED*@aol.com
> """
>
> Oh wait, it was *really* a multipart/alternative message with
> text/plain and text/html; the above is just how mutt renders it for
> me.  But never mind that; there's some interesting stuff going on
> here.
>
> Background: I grouped the 300+ recipient addresses on mail.python.org
> into 18 clusters of similar addresses; tutor at python.org falls into the
> grab-bag "default" cluster (which is what "default 0.994" means in the
> X-Spam-Status header: in the context of the "default" training DB,
> this message scored 0.994).

It would be interesting to see the whole "clue" list -- I'm guessing there
must have been more damaging stuff in the HTML part (spambayes looks at all
text/* parts).
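
To illustrate the point about text/* parts, here is a minimal sketch using
only Python's standard email package; it is not spambayes' own tokenizer.
The sample message, its HTML body, and the helper name text_parts are made
up for illustration.  It shows that a multipart/alternative message like the
one quoted above carries a text/html part in addition to the short
text/plain one, and that both are visible to anything that walks text/*
parts.

```python
from email import message_from_string

# Hypothetical message, loosely modelled on the one quoted above.
# The HTML body is invented; the real one was not shown in the thread.
RAW = """\
From: someone@example.com
To: list@example.com
Subject: tutor
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="b1"

--b1
Content-Type: text/plain; charset="US-ASCII"

UNSUBSCRIBE PLEASE
--b1
Content-Type: text/html; charset="US-ASCII"

<html><body><font face="arial">UNSUBSCRIBE PLEASE</font></body></html>
--b1--
"""


def text_parts(msg):
    """Yield (content_type, decoded_text) for every text/* part."""
    for part in msg.walk():
        # walk() also yields the multipart container itself; skip it.
        if part.get_content_maintype() == "text":
            payload = part.get_payload(decode=True)
            charset = part.get_content_charset() or "ascii"
            yield part.get_content_type(), payload.decode(charset, "replace")


msg = message_from_string(RAW)
parts = list(text_parts(msg))
for content_type, text in parts:
    print(content_type, len(text.split()), "words")
```

With a two-word text/plain part, most of the material a classifier sees
comes from the HTML markup.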

> But note that this message was sent to the wrong address: admin
> requests should never be sent to the list post address!  So, while
> this is certainly not spam in the UBE sense, it *is* undesired mail
> for tutor at python.org.  If I score this message under the "list-misc"
> or "list-owner" training -- which are for *-request and *-owner
> addresses, i.e. the *right* place to send this sort of message -- it
> scores 0.02 and 0.05 respectively.

Cool!  In my old python.org tests, you'll recall that the most significant
source of false positives was the same kind of thing:  multipart/alternative
administrivia requests with one- or two-word text/plain parts.  I used a
single database then, though.
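
The per-cluster setup Greg describes, as opposed to a single database, can
be sketched roughly as follows.  This is only an illustration, not the code
running on mail.python.org: the function names, the 0.9 cutoff, the "OK"
label, and the assumption that "list-misc" covers *-request addresses while
"list-owner" covers *-owner addresses are all guesses.  Only the cluster
names and the "SPAM (default 0.994)" header format come from the thread.
The score is passed in, since the classifier itself is out of scope here.

```python
def cluster_for(address):
    """Map a recipient address to the name of a training cluster.

    The suffix-to-cluster mapping is an assumption for illustration.
    """
    local = address.split("@", 1)[0].lower()
    if local.endswith("-request"):
        return "list-misc"
    if local.endswith("-owner"):
        return "list-owner"
    # Everything else falls into the grab-bag cluster.
    return "default"


def spam_status(address, score, cutoff=0.9):
    """Format an X-Spam-Status value for a score from that cluster's DB.

    The cutoff value and the "OK" label are hypothetical.
    """
    cluster = cluster_for(address)
    verdict = "SPAM" if score >= cutoff else "OK"
    return "%s (%s %.3f)" % (verdict, cluster, score)


print(spam_status("tutor@python.org", 0.994))
print(spam_status("tutor-request@python.org", 0.02))
```

The same message text gets a different verdict depending on which cluster's
training data produced the score, which is the effect described above.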

> So if this guy had sent his message to the right address, it would
> have been accepted without problems; Spambayes successfully blocked
> him from bothering the whole list with his off-topic request.  Cool!
>
> I guess I should go unsubscribe the poor slob now...

Or sell him penis reduction pills.  I can't imagine any other reason why
he couldn't see the list admin URL at the bottom of every email he got from
the Tutor list!

takes-one-to-know-one-ly y'rs  - tim



