[Mailman-Developers] GSoC 2013 - GNU Mailman - Introduction and Project Discussion

Sreyanth sreyanth at gmail.com
Fri Apr 12 17:22:23 CEST 2013


Here is another idea that I am interested in working on:

4. My own project idea: mining the list logs to recognize interesting
patterns that can guide improvements (the admin need not have any data
mining experience)

We could integrate this into the admin console, where the logs can be
accessed, and show interesting patterns there along with statistics
(just a basic idea; I need to work on this more). Depending on the
detected patterns, the admin may want to change some settings. Given my
experience with IR and Django, I feel this is a potential GSoC project!
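
To make the idea a bit more concrete, here is a minimal sketch of the kind
of statistic I have in mind: counting events per list and flagging lists
where bounces dominate. The log line format, the event names (`post`,
`bounce`), the sample data and the 0.5 threshold are all made up for
illustration; real Mailman logs look different and would need their own
parser.

```python
from collections import Counter, defaultdict

# Hypothetical log format: "<date> <time> <event> <list-name> <address>"
SAMPLE_LOG = """\
2013-04-12 10:00:01 post dev@example.org alice@example.org
2013-04-12 10:05:12 bounce dev@example.org bob@example.org
2013-04-12 10:07:45 post dev@example.org carol@example.org
2013-04-12 10:09:30 bounce users@example.org dave@example.org
2013-04-12 10:11:02 bounce users@example.org erin@example.org
2013-04-12 10:15:00 post users@example.org frank@example.org
"""


def count_events(log_text):
    """Return {list_name: Counter({event: count})} for well-formed lines."""
    stats = defaultdict(Counter)
    for line in log_text.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue  # skip lines that do not match the assumed format
        _date, _time, event, list_name, _address = fields
        stats[list_name][event] += 1
    return stats


def lists_with_high_bounce_ratio(stats, threshold=0.5):
    """Return the sorted names of lists whose bounce share exceeds threshold."""
    flagged = []
    for list_name, events in stats.items():
        total = sum(events.values())
        if total and events["bounce"] / total > threshold:
            flagged.append(list_name)
    return sorted(flagged)


if __name__ == "__main__":
    for name in lists_with_high_bounce_ratio(count_events(SAMPLE_LOG)):
        print("high bounce ratio:", name)
```

An admin console could show such flags next to the raw logs, so the admin
sees a hint ("many bounces on this list") without having to read the logs
line by line.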

Any suggestions?



On Fri, Apr 12, 2013 at 8:47 PM, Sreyanth <sreyanth at gmail.com> wrote:

> Hi all! Thank you very much for the awesome discussion here!
>
> On Fri, Apr 12, 2013 at 1:22 AM, Terri Oda <terri at zone12.com> wrote:
>
>>  On 13-04-11 10:44 AM, Stephen J. Turnbull wrote:
>>
>> 1.  Mailman is the wrong place to do filtering.  It's equally
>>     effective, normally covers more messages, and is somewhat more
>>     efficient in resource usage to do it at the MTA.
>> 2.  Any new algorithms **should** be made available at the MTA level
>>     where they can be best put to use by more people.  This implies
>>     something that either plugs into existing filters (such as
>>     spamassassin) or MTAs (ie, milters) rather than a Handler.
>> 3.  Adapting existing filters is generally pretty trivial: you write a
>>     10-line custom Handler that pipes it to an external process.  This
>>     isn't big enough for a GSoC project.
>> 4.  To the extent that new algorithms are involved, I have doubts that
>>     Mailman mentors have the kind of expertise needed to really help
>>     with such a project (I could be wrong, but I certainly don't know
>>     much about that kind of text processing, and I don't know that
>>     anybody else in Mailman has expertise in it).
>>
> I agree.
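
For reference, the piping described in point 3 can be sketched with nothing
but the Python standard library. This is only an illustration under stated
assumptions: it does not use Mailman's actual Handler interface (a real
Handler would wrap a call like this), `run_external_filter`, `FAKE_FILTER`
and `make_message` are made-up names, and the fake filter is a tiny stand-in
for a real external program.

```python
import subprocess
import sys
from email.message import EmailMessage


def run_external_filter(msg, command, timeout=30):
    """Pipe the serialized message to `command`; return (exit_code, stdout)."""
    result = subprocess.run(
        command,
        input=msg.as_bytes(),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        timeout=timeout,
    )
    return result.returncode, result.stdout


# Stand-in for a real filter: exits 1 if the raw message mentions "viagra".
FAKE_FILTER = [
    sys.executable,
    "-c",
    "import sys; data = sys.stdin.buffer.read().lower(); "
    "sys.exit(1 if b'viagra' in data else 0)",
]


def make_message(body):
    """Build a small test message (all addresses are placeholders)."""
    msg = EmailMessage()
    msg["From"] = "alice@example.org"
    msg["To"] = "dev@example.org"
    msg["Subject"] = "test"
    msg.set_content(body)
    return msg


if __name__ == "__main__":
    code, _ = run_external_filter(make_message("Cheap VIAGRA here"), FAKE_FILTER)
    print("filter exit code:", code)
```

How the exit code maps to hold/discard/accept is a policy choice that the
surrounding handler would have to make; the sketch only shows the plumbing.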
>
>>
>> Writing individual pipelines may be trivial, but making a user interface
>> for managing said pipelines is non-trivial.  Right now, our pipeline
>> management interface is "there's a text box in postorius that lets you
>> choose a pipeline.  It's not even a dropdown, and you may be screwed if you
>> make a typo" which is obviously not how I want it when we release. ;)
>>
>> I see a potential project timeline going something like this:
>>
>> A. make a set of custom Mailman 3 Handlers for some well-known existing
>> anti-spam/anti-malware software.  (Maybe 2-3 weeks of work here, finding
>> 2-4 reasonable pieces of software, setting them up, writing the handlers,
>> and testing them)
>>
>> B. make an interface in Postorius so list admins can
>> enable/disable/reorder these and any whitelisting happening within
>> mailman.  This should involve making an interface in Postorius that gives
>> admins the ability to change the Pipeline being used, and will likely
>> involve a small amount of user testing to make sure said interface doesn't
>> have risk of disastrous results if the administrator does the wrong thing.
>> (Another 3-4 weeks of work including user testing, unit tests, and
>> documentation)
>>
>> C. Figure out how to set up some sort of packager that can install
>> handlers + antispam software so that the site admin has an easy way to set
>> these up if requested. (Another 3-4 weeks of work, including testing any
>> scripts on a few different OSes and extensive documentation)
>>
>> D. If there's any time left over, implement some clever new filter (and
>> appropriate Handler) that makes use of the list information itself (e.g.
>> subscriber list, archives, etc.) to make better spam decisions. (At this
>> point, you've got maybe 2 weeks left in the GSoC timeline.)
>>
> This really looks great! It is almost exactly what I expected from a
> project like this. But, as Stephen and Barry pointed out, I am unsure how
> far this falls within GSoC's purview.
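
The typo problem with the pipeline text box mentioned above could be
softened by checking the entered name against the known pipelines before
saving it. A minimal sketch, assuming the interface can obtain the list of
valid names; the entries in `KNOWN_PIPELINES` and the function
`validate_pipeline_name` are placeholders for illustration, not Postorius
code:

```python
import difflib

# Placeholder names; a real interface would fetch these from Mailman.
KNOWN_PIPELINES = [
    "default-posting-pipeline",
    "default-owner-pipeline",
    "virgin",
]


def validate_pipeline_name(name, known=KNOWN_PIPELINES):
    """Return (ok, suggestions) for a pipeline name typed by the admin.

    `ok` is True only for an exact match (ignoring surrounding whitespace).
    Otherwise `suggestions` lists close matches, best match first.
    """
    cleaned = name.strip()
    if cleaned in known:
        return True, []
    return False, difflib.get_close_matches(cleaned, known, n=3, cutoff=0.6)


if __name__ == "__main__":
    ok, suggestions = validate_pipeline_name("default-posting-pipline")
    if not ok:
        print("Unknown pipeline. Did you mean:", ", ".join(suggestions))
```

A dropdown would avoid the problem entirely; the check above is only a
fallback for the case where free text input is kept.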
>
>
>>
>> I think that constitutes enough useful-to-mailman work to justify the
>> google funds, gets us some customizable spam filtering (which as you say,
>> is a frequently requested feature), but doesn't turn us into something
>> we're not.  That's why anti-spam made this year's gsoc list even though
>> we've always said "do it in the MTA" and I'm not about to change that
>> policy in general.
>>
>> Do feel free to disagree with me, of course, Stephen. Or complain that
>> I'm using the lure of antispam to get someone to solve my
>> user-interface-for-pipelines problem, which I totally am. ;)
>>
>>  Terri
>>
> Thanks for such a great timeline, Terri. I don't have any issues with it.
> As Stephen and Barry said, I also like the idea of having a milter
> interfaced at the LMTP level.
>
> On an overall positive note, I am quite convinced of the value of giving
> list admins flexible options to choose from (although, as Barry pointed
> out, why should everything be exposed to the admin via Postorius when it
> may not interest the admin?). I believe this could make a nice GSoC
> project, but with so many spam filters that people already know, I am not
> sure how much people would use this feature.
>
> Also, I would like to hear more about two other ideas: the boilerplate
> stripper and better content filtering / handling of error messages.
> Boilerplate stripping is easy to understand, but can anyone elaborate on
> better content filtering / handling of error messages?
> I strongly believe that boilerplate stripping would be a cool thing to
> have in Mailman, and who would not welcome better content-filtering and
> error-handling techniques?
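
As a small illustration of what boilerplate stripping could mean in the
simplest case, the sketch below removes a trailing signature block from a
plain-text body. It assumes the signature follows the conventional "-- "
delimiter line (dash, dash, space); `strip_signature` is a made-up name, and
a real stripper would also have to deal with disclaimers, footers and
MIME parts, none of which is handled here.

```python
SIGNATURE_DELIMITER = "-- "  # conventional separator: dash, dash, space


def strip_signature(body):
    """Drop everything from the last signature delimiter line onward.

    Bodies without a delimiter are returned with only trailing blank
    lines removed.
    """
    lines = body.splitlines()
    # Search from the end so that a "-- " inside quoted text higher up
    # does not cut off the message itself.
    for index in range(len(lines) - 1, -1, -1):
        if lines[index] == SIGNATURE_DELIMITER:
            lines = lines[:index]
            break
    # Remove blank lines left dangling at the end.
    while lines and not lines[-1].strip():
        lines.pop()
    return "\n".join(lines)


if __name__ == "__main__":
    sample = "Hello list,\n\nPatch attached.\n\n-- \nAlice\nExample Corp\n"
    print(strip_signature(sample))
```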
>
>
>
> --
> *Yours Sincerely*
>
> *Mora Sreyantha Chary*
> *Computer Engineering '14*
> *National Institute of Technology Karnataka*
> *Surathkal, India 575 025*
>




