[Mailman-Developers] Google Summer of Code - Spam Defense

Wed Apr 2 13:29:10 CEST 2008

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Mar 27, 2008, at 11:26 AM, Timo Wingender wrote:

> I'd like to participate in Google Summer of Code this year. One possible
> project for me is to implement some spam defense in Mailman. I think
> development for Mailman should be possible through the Python Software
> Foundation. Am I right about this?

Hi Timo,

I think you could do this through either the PSF or the GNU project.   
It's a good project to pursue, and either organization seems  
appropriate.

> I administer a Mailman installation with about 100 lists and thousands
> of users, and I moderate most of the lists. I think the biggest problem
> with Mailman is its lack of spam defense. Discarding messages from
> nonmembers is not an option on most lists.
>
> Some time ago I began modifying Mailman, but I never finished. The
> first step is to integrate SpamAssassin support into Mailman. To that
> end I wrote a Python class, spamc, which connects to spamd. This makes
> it possible to scan all incoming mail.
> Further ideas for spam defense are:
> - Add the possibility to scan all messages from nonmembers again half
> an hour later, before marking them as held. This is because most of
> the mails that are not recognized as spam are too new: the servers
> are not yet on any blacklist when the mail comes in.
> - Train the Bayes filter from Mailman. Forward all accepted messages
> to SpamAssassin to learn them as ham. The autolearn feature of SA
> doesn't work for me. It learns too many false negatives.

A couple of thoughts, and then I'm going to try to respond to other  
messages in this thread.  While I agree that it's generally much  
better to do spam detection upstream of Mailman, i.e. in the MTA, I  
think there is still some benefit to developing several hooks in  
Mailman to something like SpamAssassin.  One of course would be a  
fairly simple handler to recognize SA headers and do the appropriate  
thing.  Your idea of having a call out to SA to scan the message is  
valid too though, because I don't think everybody is able to hook it  
into their MTA, for whatever reason.  This wouldn't be on by default,  
but it should be an option.
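
As a rough illustration of the "recognize SA headers" handler, here is a
minimal sketch in Python. It assumes the MTA has already run SpamAssassin
and added the usual X-Spam-Flag and X-Spam-Status headers. The function
name spam_verdict, the 'hold'/'accept' return values, and the hold_score
threshold are hypothetical and are not part of Mailman's handler API.

```python
import re
from email import message_from_string

# Matches "score=12.3" (and the older "hits=12.3") in X-Spam-Status.
_SCORE_RE = re.compile(r'(?:score|hits)=(-?\d+(?:\.\d+)?)')


def spam_verdict(raw_message, hold_score=5.0):
    """Return 'hold' or 'accept' from SpamAssassin headers on the message.

    hold_score is a hypothetical per-list threshold, not a Mailman setting.
    """
    msg = message_from_string(raw_message)

    # X-Spam-Flag: YES is set when SA itself classified the message as spam.
    if msg.get('X-Spam-Flag', '').strip().upper() == 'YES':
        return 'hold'

    # Otherwise compare the reported score with the list's own threshold.
    match = _SCORE_RE.search(msg.get('X-Spam-Status', ''))
    if match and float(match.group(1)) >= hold_score:
        return 'hold'

    # No SA headers, or a score below the threshold: let it through.
    return 'accept'
```

A message without any SA headers is accepted here, which matches the idea
that this check would be optional and off by default.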

Several years ago, a few others and I worked on some code to hook
Mailman's approval mechanism into Spambayes training.  It worked
moderately well, but not well enough to ever add to Mailman proper.
I'm sure the patches are still on SourceForge and might even still
apply to MM 2.1.  It's an interesting idea that you might like to dust
off and see if you can get working for SA.
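
To make the training idea concrete for SA, the following is a sketch
only, not the Spambayes patch mentioned above. It assumes the sa-learn
command-line tool is installed and reads a single message from standard
input when no file is named. The function names and the injectable run
parameter are hypothetical, and nothing here is wired into Mailman's
actual approval code.

```python
import subprocess


def sa_learn_command(approved):
    """Build the sa-learn command line for a moderator decision."""
    # Approved messages are learned as ham, rejected ones as spam.
    return ['sa-learn', '--ham' if approved else '--spam']


def train_on_decision(raw_message, approved, run=subprocess.run):
    """Feed one message to sa-learn on stdin.

    The run parameter exists so the call can be replaced in tests;
    by default it is subprocess.run.
    """
    cmd = sa_learn_command(approved)
    return run(cmd, input=raw_message.encode('utf-8'), check=True)
```

Calling train_on_decision(text, approved=True) after a moderator accepts
a held message would teach the Bayes database that the message is ham,
without relying on SA's autolearn.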

> These are my ideas so far. Are they welcome in Mailman, and is this
> enough for a GSoC project? Where would it fit best? 2.1.11? 2.2.0? 3.0.0?

I wouldn't do it for 2.1 since I'd like to be very strict about "no  
new features" for 2.1.  It would probably be most useful for people in  
2.2, but I hope that you'll also consider looking at 3.0 because I  
think the architecture will be more amenable to these ideas.  E.g.  
you'd be able to reject spammy messages during the LMTP phase.

I'm planning on releasing Mailman 3 alpha 1 in the next day or so.   
It's basically ready, but I have to fix an annoying setuptools problem.

Cheers,
- -Barry

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.8 (Darwin)

iEYEARECAAYFAkfzbgcACgkQ2YZpQepbvXFAMQCfTQZ/Ef6XCHGHUjMu9vVPgqoZ
7l8An3FotgRC+CeKbCcu3tjk6oxuvbyu
=/Oay
-----END PGP SIGNATURE-----