[Mailman-Developers] GSoC 2013 - GNU Mailman - Introduction and Project Discussion

Fri Apr 12 17:22:23 CEST 2013

And here is another idea of mine that I am interested in working on:

4. My own project idea: Mining the list logs to recognize interesting
patterns that suggest improvements (the admin need not have data mining
experience)

We could integrate this into the admin console where the logs can be
accessed, so that interesting patterns are shown alongside the logs,
together with statistics (just a basic idea; I need to work on this more).
Depending on the detected patterns, the admin may want to change some
settings! Given my experience with IR and Django, I feel this is a
potential GSoC project!
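
To make the idea a bit more concrete, here is a minimal sketch in Python.
It assumes a made-up log format ("<date> <time> <event> <list> <address>"),
which is NOT Mailman's real log format, and the function names and the
threshold are purely illustrative. It counts events and turns one simple
pattern (the same sender being held repeatedly) into a hint for the admin:

```python
from collections import Counter

# Hypothetical log format (an assumption, not Mailman's real one):
#   "<date> <time> <event> <list> <address>"
SAMPLE_LOG = """\
2013-04-10 09:15:02 post dev@example.com alice@example.com
2013-04-10 09:20:41 held dev@example.com spammer@example.net
2013-04-11 11:02:13 held dev@example.com spammer@example.net
2013-04-11 11:05:00 bounce dev@example.com bob@example.org
2013-04-12 08:00:30 post dev@example.com alice@example.com
2013-04-12 08:01:10 held dev@example.com spammer@example.net
"""


def parse_line(line):
    """Split one log line into a record, or return None if it is malformed."""
    parts = line.split()
    if len(parts) != 5:
        return None
    date, _time, event, list_name, address = parts
    return {"date": date, "event": event, "list": list_name, "address": address}


def summarize(log_text):
    """Count events overall and held messages per sender address."""
    events = Counter()
    held_by_sender = Counter()
    for line in log_text.splitlines():
        record = parse_line(line)
        if record is None:
            continue  # skip lines that do not match the assumed format
        events[record["event"]] += 1
        if record["event"] == "held":
            held_by_sender[record["address"]] += 1
    return events, held_by_sender


def suggest(held_by_sender, threshold=3):
    """Turn repeated holds into plain-language hints (threshold is arbitrary)."""
    return [
        f"{address}: {count} held messages, consider a ban or discard rule"
        for address, count in held_by_sender.most_common()
        if count >= threshold
    ]


events, held = summarize(SAMPLE_LOG)
print(dict(events))
for hint in suggest(held):
    print(hint)
```

A real version would read the actual log files and look for richer
patterns, but the point is that the admin only sees the stats and the
hints, not the mining itself.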

Any suggestions?

On Fri, Apr 12, 2013 at 8:47 PM, Sreyanth <sreyanth at gmail.com> wrote:

> Hi all! Thank you very much for the awesome discussion here!
>
> On Fri, Apr 12, 2013 at 1:22 AM, Terri Oda <terri at zone12.com> wrote:
>
>>  On 13-04-11 10:44 AM, Stephen J. Turnbull wrote:
>>
>> 1.  Mailman is the wrong place to do filtering.  It's equally
>>     effective, normally covers more messages, and is somewhat more
>>     efficient in resource usage to do it at the MTA.
>> 2.  Any new algorithms **should** be made available at the MTA level
>>     where they can be best put to use by more people.  This implies
>>     something that either plugs into existing filters (such as
>>     spamassassin) or MTAs (ie, milters) rather than a Handler.
>> 3.  Adapting existing filters is generally pretty trivial: you write a
>>     10-line custom Handler that pipes it to an external process.  This
>>     isn't big enough for a GSoC project.
>> 4.  To the extent that new algorithms are involved, I have doubts that
>>     Mailman mentors have the kind of expertise needed to really help
>>     with such a project (I could be wrong, but I certainly don't know
>>     much about that kind of text processing, and I don't know that
>>     anybody else in Mailman has expertise in it).
>>
>> I agree.
>
>>
>> Writing individual pipelines may be trivial, but making a user interface
>> for managing said pipelines is non-trivial.  Right now, our pipeline
>> management interface is "there's a text box in postorius that lets you
>> choose a pipeline.  It's not even a dropdown, and you may be screwed if you
>> make a typo" which is obviously not how I want it when we release. ;)
>>
>> I see a potential project timeline going something like this:
>>
>> A. make a set of custom Mailman 3 Handlers for some well-known existing
>> anti-spam/anti-malware software.  (Maybe 2-3 weeks of work here, finding
>> 2-4 reasonable pieces of software, setting them up, writing the handlers,
>> and testing them)
>>
>> B. make an interface in Postorius so list admins can
>> enable/disable/reorder these and any whitelisting happening within
>> mailman.  This should involve making an interface in Postorius that gives
>> admins the ability to change the Pipeline being used, and will likely
>> involve a small amount of user testing to make sure said interface doesn't
>> have risk of disastrous results if the administrator does the wrong thing.
>> (Another 3-4 weeks of work including user testing, unit tests, and
>> documentation)
>>
>> C. Figure out how to set up some sort of packager that can install
>> handlers + antispam software so that the site admin has an easy way to set
>> these up if requested. (Another 3-4 weeks of work, including testing any
>> scripts on a few different OSes and extensive documentation)
>>
>> D. If there's any time leftover, implement some clever new filter (and
>> appropriate Handler) that makes use of the list information itself (e.g.
>> subscriber list, archives, etc.) to make better spam decisions. (at this
>> point, you've got maybe 2 weeks left in the GSoC timeline)
>>
> This really looks great! It is almost what I expected from a project
> like this.
> But, as Stephen and Barry pointed out, I am unsure how far this
> falls under GSoC's purview.
> 
>
>>
>> I think that constitutes enough useful-to-mailman work to justify the
>> google funds, gets us some customizable spam filtering (which as you say,
>> is a frequently requested feature), but doesn't turn us into something
>> we're not.  That's why anti-spam made this year's gsoc list even though
>> we've always said "do it in the MTA" and I'm not about to change that
>> policy in general.
>>
>> Do feel free to disagree with me, of course, Stephen. Or complain that
>> I'm using the lure of antispam to get someone to solve my user interface
>> for pipelines problem, which I totally am. ;)
>>
>>  Terri
>>
> Thanks for such a great timeline, Terri. I don't have any issues with it.
> As Stephen and Barry said, I also like the idea of having a milter
> interfaced at the LMTP level.
>
> On an overall positive note, I am quite convinced of the value of giving
> the list admin flexible options to choose from (although, as Barry
> pointed out, why should everything be exposed to the admin via Postorius
> when it may not be of interest to the admin?). I believe this could make
> a nice GSoC project, but with so many spam filters that people are
> already acquainted with, I am not sure how much people would use this
> feature.
>
> Also, I would like to hear more about two other ideas: the boilerplate
> stripper, and better content filtering / handling of error messages.
> Boilerplate stripping is easy to understand, but can anyone elaborate
> on better content filtering / handling of error messages?
> I strongly believe that boilerplate stripping would be a cool thing to
> have in Mailman, and obviously, who would not welcome better
> content-filtering / error-handling techniques on board?
>
>
>
> --
> *Yours Sincerely*
> *Mora Sreyantha Chary*
> *Computer Engineering '14*
> *National Institute of Technology Karnataka*
> *Surathkal, India 575 025*
>

-- 
*Yours Sincerely*
*Mora Sreyantha Chary*
*Computer Engineering '14*
*National Institute of Technology Karnataka*
*Surathkal, India 575 025*