[Spambayes] Beyond Spambayes

Wed Feb 22 19:20:48 CET 2006

   From: "Seth Goodman" <sethg at GoodmanAssociates.com>

   By employing a variety of rejection tools (e.g. DNSBLs for the
   connecting IP plus HELO name and rDNS heuristics), most of the load can
   be rejected during the envelope phase of SMTP.  For the ones that make
   it past the envelope, it is still possible to do the remaining content
   checks during the DATA phase and make the sender wait before confirming
   acceptance with a 250 code.  Many people argue that spammers often abuse
   pipelining and dump the whole message after the DATA command then
   disconnect, not waiting around for the acceptance.  Any MTA behaving
   that way can be added to a local DNSBL so you don't talk to them next
   time.  
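
For reference, a DNSBL check on the connecting IP is conventionally a DNS
lookup of the reversed octets under the list's zone. Below is a minimal
Python sketch of that lookup, assuming IPv4 only. The zone
dnsbl.example.org and both function names are placeholders for
illustration, not anything named in this thread.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone."""
    # Validate the address first; raises ValueError on malformed input.
    addr = ipaddress.IPv4Address(ip)
    # DNSBLs are queried with the octets reversed, like a PTR lookup.
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone="dnsbl.example.org"):
    """Return True if the zone answers for this IP (hypothetical zone name)."""
    try:
        # Any A record in the answer means "listed"; no record means "not listed".
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False
```

A local list of the kind described above would be queried the same way,
with the zone pointing at your own DNS server.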

A problem is that with the rise of botnet armies, we're seeing the majority
of spam actually coming from bots, not "bulletproof" servers or open
relays.  That is, a majority of spam is identical spam (indicating it
was sent at the behest of one individual), but was sent from a large
number of different sources via different paths.  In short, a
"perfect" RBL (one that had 100% perfect input and propagated it at
superluminal velocity) would still only get about 40% of the spammers.

   Similarly, there are a number of heuristics that can catch this
   type of spammer early:  put in a delay after the connection request
   before you send the banner.  Anyone who doesn't wait for the end of
   banner can be safely disconnected and blacklisted for the future.  If
   you want to perform a public service, tarpit them instead of merely
   rejecting and blacklisting.  
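
The delayed-banner heuristic can be sketched roughly as follows: hold
the 220 greeting back for a few seconds and see whether the client sends
anything first. This is only an illustration in Python, not how any
particular MTA does it. The function names, the 5-second default, and
the mx.example.org banner are all assumptions.

```python
import select
import socket


def is_early_talker(conn, delay_seconds):
    """Return True if the client sends data (or hangs up) before the banner."""
    # Wait up to delay_seconds for the socket to become readable.
    # A well-behaved SMTP client stays silent until it has seen the 220 line,
    # so anything readable here means it talked early or disconnected.
    readable, _, _ = select.select([conn], [], [], delay_seconds)
    return bool(readable)


def greet(conn, delay_seconds=5.0, banner=b"220 mx.example.org ESMTP\r\n"):
    """Delay the banner; drop early talkers, greet everyone else."""
    if is_early_talker(conn, delay_seconds):
        # The caller could also record the peer address in a local blacklist.
        conn.close()
        return False
    conn.sendall(banner)
    return True
```

A tarpit would differ only in the early-talker branch: keep the
connection open and respond very slowly instead of closing it.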

I was under the impression that a pipelining MTA doesn't care what happens
after the port opens successfully.  In that case, tarpitting won't
matter; they're not waiting for the ACK packets.

It's all one big mess, if you ask me.  :(

Adding an answerback at the end of DATA (like three-phase commit) would
have been a nice thing, but it's a little late for that.

     -Bill Yerazunis