[Mailman-Developers] GUI hacking for SA integration

Sun Nov 30 17:05:08 EST 2003

On Thu, 2003-11-27 at 11:42, J C Lawrence wrote:

> First thought:
> 
>   Discard message on <regex>
>   Hold message on <regex>
>   Accept message on <regex>
> 
> Where the regex fields are in fact lists of regexes.  Given such support
> it is fairly simple to define regexes which match SpamAssassin headers
> of value X and above/below for the relevant entries, or SpamBayes or
> whatever.

Here are some strawman ideas for y'all to knock down. :)

I'm thinking for the next version that we want to be able to set up a
list of rules which can be added, removed, or moved up or down.  Each
rule has a condition and an action.  As a model, think something like
Evolution's filter u/i.

Everything that's currently defined in Hold.py and Moderate.py would
instead define a rule, such as "If matches regular expressions", or "If
moderated member", or "If size greater than".  The actions would by
default include such things as reject, hold, discard, but it could be
other things like 'scrub', e.g. "if contains text/html, scrub".  There
would also be actions like "stop processing", or "save to folder".
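
To make the strawman a bit more concrete, here is a minimal sketch of an
ordered rule list in Python.  None of this is existing Mailman code: the
Rule class, the condition helpers, and first_action() are hypothetical
names, and the action is just a string that a real handler would have to
interpret.

```python
import re
from dataclasses import dataclass
from email.message import Message
from typing import Callable, List, Optional


@dataclass
class Rule:
    """One entry in the list: a condition plus the action to take."""
    name: str
    condition: Callable[[Message], bool]
    action: str  # e.g. "hold", "discard", "reject", "scrub"


def header_matches(pattern: str) -> Callable[[Message], bool]:
    """Condition: some 'Name: value' header line matches the regex."""
    rx = re.compile(pattern, re.IGNORECASE)

    def check(msg: Message) -> bool:
        return any(rx.search(f"{k}: {v}") for k, v in msg.items())

    return check


def size_greater_than(limit: int) -> Callable[[Message], bool]:
    """Condition: the flattened message is longer than limit characters."""
    def check(msg: Message) -> bool:
        return len(msg.as_string()) > limit

    return check


def first_action(rules: List[Rule], msg: Message) -> Optional[str]:
    """Walk the rules in order; return the first matching action."""
    for rule in rules:
        if rule.condition(msg):
            return rule.action
    return None
```

Since the rules live in a plain list, "move up" and "move down" are just
a reordering of that list, and the first match wins.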

Let me blue sky a little bit. ;)  To address Simone's suggestion of
saving a pristine copy, and to support Spambayes training, I'm thinking
that if Mailman had a (limited) IMAP interface, we could use special
folders to provide functionality and instruct Mailman on actions to
take.  Imagine that each list had the following IMAP folders:

      * Pristine - unadulterated copies of the message as Mailman
        received them from the MTA.
      * Held - Messages held for moderator approval
      * Approved - Moving messages from Held to Approved would instruct
        Mailman to allow the message to pass to the list
      * Spam - messages matching a spam score
      * Unsure - messages that are not quite ham, not quite spam
      * Preserve - messages the moderator wants to preserve for future
        reference.
      * SpamTrain, HamTrain - messages used to train (or retrain) a
        Bayesian-type classifier

Maybe there's more, maybe different names.  The point is that messages
end up in the folders because of rule actions, but some folders are
scanned by Mailman to take actions on the messages.
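
A rough sketch of the "scanned folders" half of that, assuming the
folder contents are available as a simple mapping from folder name to
message ids.  The FOLDER_ACTIONS table, the action names, and
pending_actions() are all made up for illustration; how Mailman would
actually learn the folder contents is left open.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical mapping: folders that Mailman scans, and what finding a
# message there would mean.  Folders not listed are storage only.
FOLDER_ACTIONS = {
    "Approved": "deliver",
    "SpamTrain": "train_spam",
    "HamTrain": "train_ham",
}


def pending_actions(
    folders: Dict[str, List[str]],
    already_done: Set[Tuple[str, str]],
) -> List[Tuple[str, str]]:
    """Return (action, msgid) pairs for messages not yet acted upon."""
    actions = []
    for folder, action in FOLDER_ACTIONS.items():
        for msgid in folders.get(folder, []):
            if (folder, msgid) not in already_done:
                actions.append((action, msgid))
    return actions
```

A moderator dragging a message from Held to Approved would then show up
as a new ("deliver", msgid) pair on the next scan.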

Now, we'd have to protect these folders by only allowing admin or
moderator login (probably over imaps if available), and there'd be some
size or time limit on the messages in the folders.  E.g. you couldn't
preserve more than xMB of messages, or spam would get automatically
discarded after x days.
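
The size and time limits could be as simple as the following sketch.
The Stored record and prune() are hypothetical, and dropping the oldest
messages first when over the size limit is only one possible policy.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Stored:
    msgid: str
    received: float  # seconds since the epoch
    size: int        # bytes


def prune(
    folder: List[Stored],
    now: float,
    max_age_days: Optional[float] = None,
    max_bytes: Optional[int] = None,
) -> List[Stored]:
    """Return the messages to keep, oldest first."""
    kept = sorted(folder, key=lambda m: m.received)
    if max_age_days is not None:
        # Time limit: drop anything received before the cutoff.
        cutoff = now - max_age_days * 86400
        kept = [m for m in kept if m.received >= cutoff]
    if max_bytes is not None:
        # Size limit: drop the oldest messages until the folder fits.
        total = sum(m.size for m in kept)
        while kept and total > max_bytes:
            total -= kept.pop(0).size
    return kept
```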

The other important point is that you would use a real mail reader to
access this information (and no, we wouldn't do POP).  I'm strongly of
the opinion that I really want to deal with messages using existing tools
that already know how to deal with email.  Lots of interesting questions
then -- do we provide IMAP access for non-admins?  (I think not, that's
what NNTP access is for).  Also, for admins who just can't use IMAP,
we'd need to provide /some/ web interface for them to deal with things. 
I suspect it'll be clunky no matter how much Javascript we throw at it
<wink>, but thems the breaks.

All this is pie-in-the-sky and doesn't much help Jeff for the 2.1 (or
even 2.x <wink>) branch.  So practically speaking, I'd keep it simple. 
Work off the cvs head since even the following would have to be a MM2.2
feature: expand the existing Privacy->Spam Filters page to take a list
of regular expressions, with an action for each one.  As a model, use
the Topics widgets idea, which always adds a new blank one when you
submit a new topic filter.  You'd probably also want to add Up and Down
buttons.
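
For what it's worth, the form handling for that could look roughly like
this.  It is only a sketch: normalize() and move() are hypothetical
helpers, not anything in the current Topics code, each row is assumed to
be a (regex, action) pair, and "hold" as the default action for the
blank row is an arbitrary choice.

```python
from typing import List, Tuple

Row = Tuple[str, str]  # (regular expression, action)


def normalize(rows: List[Row]) -> List[Row]:
    """Drop empty rows, then append one blank row for the next entry."""
    filled = [(pattern.strip(), action)
              for pattern, action in rows if pattern.strip()]
    return filled + [("", "hold")]


def move(rows: List[Row], index: int, delta: int) -> List[Row]:
    """Return a copy with rows[index] moved by delta (-1 Up, +1 Down)."""
    rows = list(rows)
    target = index + delta
    if 0 <= index < len(rows) and 0 <= target < len(rows):
        rows[index], rows[target] = rows[target], rows[index]
    return rows


# Example rows; X-Spam-Level carries one asterisk per point of score, so
# "five or more" is a simple repeat count.
rows = normalize([
    (r"^X-Spam-Level: \*{10,}", "discard"),
    (r"^X-Spam-Level: \*{5,}", "hold"),
    ("", "hold"),
])
```

Order matters here: the stricter pattern has to come first, which is
exactly why the Up and Down buttons are worth having.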

-Barry