Graham's spam filter

Erik Max Francis max at alcyone.com
Thu Sep 5 03:01:42 EDT 2002


Aaron Swartz wrote:

> I've been using bogofilter[1], Eric Raymond's Graham-derived spam
> filter which threw away base64-encoded data and 90% of all spam that
> got past the filter was base64-encoded. Therefore, I think that base64
> content really needs to be decoded. I wrote a base64-decoding filter
> in Python for it and the problem has gone away.

Indeed.  I've been finding much the same thing with my rule-based
filter: about 90% of the spam that gets through is base64-encoded.
I haven't yet taken the next step of automatically decoding the base64
text parts and running the filter on the decoded text, but as you have
discovered, it is the obvious solution to an obvious problem.
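
For concreteness, here is a minimal sketch of that decoding step, assuming
the standard library's email package in a current Python 3.  The function
name decoded_text_parts is hypothetical; this is not the filter Aaron wrote,
nor anything taken from bogofilter, just one way the idea might look.

```python
import base64
from email import message_from_string
from typing import List


def decoded_text_parts(raw_message: str) -> List[str]:
    """Return the text parts of a message with transfer encodings undone.

    Hypothetical helper: a spam filter could tokenize these strings
    instead of the raw (possibly base64-encoded) body.
    """
    msg = message_from_string(raw_message)
    texts = []
    for part in msg.walk():
        # Only text/* parts are of interest; multipart containers and
        # attachments such as images are skipped.
        if part.get_content_maintype() != "text":
            continue
        # decode=True reverses base64 or quoted-printable and returns bytes.
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        charset = part.get_content_charset() or "latin-1"
        try:
            texts.append(payload.decode(charset, errors="replace"))
        except LookupError:
            # Unknown or bogus charset label: fall back to latin-1,
            # which can decode any byte sequence.
            texts.append(payload.decode("latin-1", errors="replace"))
    return texts


if __name__ == "__main__":
    # Build a small base64-encoded sample message for demonstration.
    body = base64.b64encode(b"Buy cheap pills now").decode("ascii")
    sample = (
        "From: spammer@example.com\n"
        "Subject: hello\n"
        "MIME-Version: 1.0\n"
        'Content-Type: text/plain; charset="us-ascii"\n'
        "Content-Transfer-Encoding: base64\n"
        "\n" + body + "\n"
    )
    print(decoded_text_parts(sample))
```

The decoded strings can then be handed to whatever tokenizer or rule set
the filter already uses, so the rest of the filter does not need to know
how the body was encoded.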

-- 
 Erik Max Francis / max at alcyone.com / http://www.alcyone.com/max/
 __ San Jose, CA, US / 37 20 N 121 53 W / ICQ16063900 / &tSftDotIotE
/  \ There is nothing so subject to the inconstancy of fortune as war.
\__/ Miguel de Cervantes
    Church / http://www.alcyone.com/pyos/church/
 A lambda calculus explorer in Python.


