[Mailman-Developers] ANNOUNCE: GNU Mailman 3.0a1 (Leave That Thing Alone)

Fri Apr 11 15:30:26 CEST 2008

--On 10 April 2008 22:52:49 -0400 Barry Warsaw <barry at list.org> wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On Apr 9, 2008, at 2:42 PM, Ian Eiloart wrote:
>> --On 8 April 2008 22:10:58 -0400 Barry Warsaw <barry at list.org> wrote:
>>
>>> After far too long, I'm finally happy to announce the availability of
>>> GNU Mailman version 3.0 alpha 1, code name "Leave That Thing Alone".
>>
>> This is great news.
>>
>> I've downloaded the code, and had a look at the LMTP server code,
>> which I'm looking forward to. I'm hoping that it will permit
>> integration with Exim in a way (LMTP call forwards) that gives us
>> something sensible to do with messages from forbidden posters.
>>
>> Unfortunately, it doesn't. But, if the following seems sensible,
>> then I'm willing to implement it:
>
> Hi Ian, I'm really psyched that you've taken a look and are interested in
> fleshing this part of LMTP out.  I didn't have time for it before getting
> 3.0a1 out, but I did think about it, so let me outline how I was thinking
> of doing it.  I'm definitely open to other approaches if it makes the
> feature work better.
>
>> The LMTP server is implemented as a subclass of Python's built in
>> SMTPServer, which seems to have only one method: process_message.
>> That must be run too late to give any useful information to a
>> callforward.
>
> IIRC, this is going to be called when the "\r\n.\r\n" terminator to the
> DATA command is seen.  That might be an interesting place to hook in
> content filtering.

Yes, that's the earliest opportunity to filter content. It also allows the 
LMTP server to accept the message for some lists, but not others. It 
doesn't help the local SMTP server, though, since it's too late for an SMTP 
server to selectively reject.

Some admins may prefer this though. For example, I monitor my SMTP server 
queues more closely than my Mailman queues.

>> However, on closer inspection, there's a "channel" class which
>> inherits from smtpd.SMTPChannel. This seems to override some useful
>> looking methods:
>>
>>   def smtp_LHLO(self, arg):
>>   def smtp_HELO(self, arg):
>>
>> Ah, and the source suggests to me that overriding smtp_RCPT is
>> feasible. That's the place to check whether the list exists (and
>> whether the sender is permitted to post - but let's do one thing at
>> a time).
>>
>> Validating the recipient
>> ------------------------
>>
>> All that's required is that the code which checks the existence of a
>> list is moved from process_message to smtp_RCPT, and it's modified
>> to "self.__rcpttos.append(address)" when the address is valid, but
>> not otherwise. Thus, we build rcpttos instead of reading from it.
>> LMTP requires a reply after DATA for only those addresses that
>> haven't been rejected at RCPT. If I'm reading the code correctly,
>> then a 550 error would be generated if the queue disappeared between
>> RCPT and DATA (by virtue of the exception being thrown by
>> queue.enqueue()).
>
> We probably want to refactor the code that splits the recipient address
> into a separate method, this would validate the fqdn list name and
> subaddress, but it wouldn't instantiate a Switchboard object.
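
A minimal sketch of the refactoring described above: a helper that splits
and validates a recipient address without touching any queue. The function
name, the known_lists argument and the set of subaddresses are assumptions
made for illustration, not Mailman's actual API.

```python
# Hypothetical helper; the subaddress set below is an assumption.
KNOWN_SUBADDRESSES = frozenset({
    "bounces", "confirm", "join", "leave",
    "owner", "request", "subscribe", "unsubscribe",
})


def split_recipient(address, known_lists):
    """Return (fqdn_listname, subaddress), or None if not deliverable.

    subaddress is None for the plain posting address. No Switchboard
    (queue) object is created here; this only inspects the address.
    """
    local, sep, domain = address.strip().lower().rpartition("@")
    if not sep or not local or not domain:
        return None
    # Try the plain posting address first, so that a list whose own
    # name ends in e.g. "-owner" still resolves to itself.
    fqdn = f"{local}@{domain}"
    if fqdn in known_lists:
        return fqdn, None
    # Otherwise peel off a trailing "-subaddress" and try again.
    listname, dash, sub = local.rpartition("-")
    if dash and sub in KNOWN_SUBADDRESSES:
        fqdn = f"{listname}@{domain}"
        if fqdn in known_lists:
            return fqdn, sub
    return None
```

With this shape, smtp_RCPT could store the returned pair in self._rcpttos
and process_message() would only create queue objects for pairs that were
accepted.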

Hmm, I don't know what a Switchboard object is.

> Might as well skip that if the message isn't going to get delivered.  If
> everything looks good, store the fqdn listname and subaddress in
> self._rcpttos (no need for double underscore) and then process_message()
> can iterate over that and create the Switchboard instance.

> If the recipient is not valid, store a marker object instead so that it
> can be compared against in the self._rcpttos loop in process_message().

No, I don't think that's necessary. If we don't like the argument to RCPT 
we should say so with a 4xx or 5xx error, and then add NOTHING to 
self._rcpttos. The RFC only wants us to give a reply for addresses that 
made it past RCPT. It's crucial to get this right, because the returned 
list of codes isn't indexed, and therefore has to be matched up with the 
list of RCPT commands that didn't get a rejection.

If you look at the example here: 
<http://www.apps.ietf.org/rfc/rfc2033.html#sec-4.2>, there are three RCPT 
commands, but only two post-DATA responses.
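
The bookkeeping argued for above can be sketched as a toy model of one LMTP
transaction. This is not Mailman's runner and does not use smtpd; the class
name, the enqueue callback and the reply texts are assumptions. The point
is only that a rejected RCPT adds nothing to the list, so the replies after
DATA line up one-to-one with the accepted recipients.

```python
class LMTPSessionSketch:
    """Toy model of one LMTP transaction (illustrative only)."""

    def __init__(self, known_lists):
        self._known_lists = known_lists
        self._rcpttos = []

    def rcpt(self, address):
        """Answer RCPT TO at once; remember only accepted addresses."""
        if address not in self._known_lists:
            # Rejected here, so nothing is appended and no reply
            # will be owed for this address after DATA.
            return "550 No such mailing list"
        self._rcpttos.append(address)
        return "250 Ok"

    def data(self, message, enqueue):
        """After the terminating dot: one reply per accepted RCPT."""
        replies = []
        for address in self._rcpttos:
            try:
                enqueue(address, message)
            except Exception:
                # e.g. the queue disappeared between RCPT and DATA
                replies.append(f"550 <{address}> delivery failed")
            else:
                replies.append(f"250 <{address}> Ok")
        self._rcpttos = []
        return replies
```

Three RCPT commands with one rejection therefore produce exactly two
replies after DATA, in the order the accepted recipients were given.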

Of course, we could say yes to everything at RCPT, but then our SMTP server 
callouts are pointless.

> When that loop sees the marker object, it appends an ERR_550 without
> bothering to create the Switchboard instance.
>
>> Validating the sender
>> ---------------------
>>
>> Now, as to the question of determining whether the message sender is
>> permitted to post to a list, that's a bit more complicated. As far
>> as I can see, all the logic for this is in moderate.py - in the
>> "process" function. But, the decision is all tangled up with
>> the actions.
>>
>> I propose an additional function, to be called from process, or by
>> the lmtp server in smtp_RCPT: perhaps post_action(list, sender)
>> which returns the name of the action to take when "sender" posts to
>> the list "list". I guess it will return one of "accept", "discard",
>> "reject", or "hold", and the lmtp server should reject (not bounce)
>> the message if "reject" or "discard" is returned. It might be nice
>> to introduce a distinction between bounce and reject. For example, a
>> site admin might decide to permit bounce messages to be generated
>> for local mail domains.
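
A rough sketch of the proposed function, to make the idea concrete. The
ListPolicySketch class and its fields are stand-ins invented for this
example; the real decision logic in Mailman is more involved and is not
reproduced here.

```python
from dataclasses import dataclass, field


@dataclass
class ListPolicySketch:
    """Hypothetical stand-in for the parts of a list that matter here."""
    members: set = field(default_factory=set)
    banned: set = field(default_factory=set)
    nonmember_action: str = "hold"


def post_action(mlist, sender):
    """Return "accept", "discard", "reject" or "hold" for this sender."""
    sender = sender.lower()
    if sender in mlist.banned:
        return "discard"
    if sender in mlist.members:
        return "accept"
    return mlist.nonmember_action


def reject_at_rcpt(mlist, sender):
    """True if the LMTP server should refuse the recipient outright."""
    # As proposed: "reject" and "discard" both become an SMTP-time
    # rejection, so no bounce message is ever generated.
    return post_action(mlist, sender) in ("reject", "discard")
```

Keeping the decision (post_action) apart from what is done with it
(reject_at_rcpt, or the existing moderation code) is the separation the
proposal asks for.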
>
> That's not the way I was thinking about doing it.  Although you can't
> really tell, pipeline/moderate.py is obsolete and will likely go away.
> It doesn't fit the current model where moderation occurs by rule-chain
> processing in the incoming queue, and the pipeline queue runs handlers
> which only modify already accepted messages.  I really should have moved
> moderate.py to an attic because there are bits I still want to convert,
> but just haven't gotten to yet.
>
> Originally, I was thinking we could just define an IChain which could be
> used in the lmtp runner.  This new chain would be specifically defined to
> run a smaller set of rules that would validate the sender, or do other
> early filtering.  But there's a problem with that because the interface
> to IChain and IRule takes three arguments, the mailing list, the message,
> and the message metadata dictionary.
>
> One problem is that until you get to process_message() (i.e. you've
> seen the DATA terminating dot), you won't have a message object, and many
> of the rules you're going to want require a message object.  You could
> get around that by defining some custom rules that ignored the 'msg'
> argument, and you could put everything you know so far in the msgdata
> dictionary.  These rules would only work then in the lmtp runner.

Or, could you pass an empty message? I guess not if a list is configured to 
reject implicit recipients, for example.

> For the mlist argument, you probably need to run the rule-chain for each
> RCPT TO.  Because you've already determined that it's a valid mailing
> list above, you get the mailing list object and use that in the rule
> chain.  It would therefore be possible to reject the message for list A
> but accept it for list B.  I think you want to return a 553 if you're
> going to reject the recipient.
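
The per-recipient rule run described above might look roughly like this.
The three-argument shape (mailing list, message, metadata dictionary) comes
from the description; the method name check, the rule class, the dict used
as a mailing list and the msgdata key are all assumptions for illustration.

```python
class NonmemberEnvelopeRule:
    """Hypothetical envelope-only rule: hits when the sender is not a member."""

    name = "envelope-nonmember"

    def check(self, mlist, msg, msgdata):
        # msg is deliberately ignored: at RCPT time no message exists
        # yet, so everything known so far travels in msgdata.
        return msgdata["envelope_sender"] not in mlist["members"]


def run_rcpt_chain(rules, mlist, msgdata):
    """Run the reduced chain for one RCPT TO and return an SMTP reply."""
    for rule in rules:
        if rule.check(mlist, None, msgdata):
            return f"553 Rejected by rule {rule.name}"
    return "250 Ok"


# The same sender can be refused for list A and accepted for list B.
list_a = {"name": "a@example.com", "members": {"ann@example.com"}}
list_b = {"name": "b@example.com",
          "members": {"ann@example.com", "bob@example.com"}}
msgdata = {"envelope_sender": "bob@example.com"}
rules = [NonmemberEnvelopeRule()]
reply_a = run_rcpt_chain(rules, list_a, msgdata)
reply_b = run_rcpt_chain(rules, list_b, msgdata)
```

Rules written this way would only be usable in the LMTP runner, as noted
above, because ordinary rules expect a real message object.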
>
> Upshot is that in thinking about this, I think it's doable using the
> rule-chain architecture during the smtp_RCPT call.

That's good news.

> Does the above make sense?

To an extent. I know now that I was wise to ask before ploughing on!

The first part seems quite easy to do, and has the significant gain for me 
that it allows me to put my Mailman server on a different host from my MTA. 
Currently, my MTA is looking for the existence of the Mailman files for 
routing purposes. Calling out to the LMTP server is a much more flexible 
solution.

The second part, screening senders, seems harder at the moment, because I 
don't understand how the chains work. I'll take another look, but a brief 
explanation of the concept, and how they're called would be useful.

>  If you decide to work on it, I'd be happy to
> help answer more specific questions, and of course review the code.  I'm
> not opposed to making some architectural changes to suit this better,

Hmm, well if we can't do stuff at RCPT time, then I think the architecture 
is flawed. Whether it's reasonably easy to fix, I can't tell.

I think it's really important to separate out the code for determining 
whether a sender can post to a list. In my experience, it's the only test 
that (a) users really understand and (b) is really useful.

It's also important because it helps us to decide whether we are safe to 
generate a bounce message, in the event that the message content is bad. 
I'd argue that it usually is safe when the sender is a member of the list, 
but usually not otherwise. That still leaves us with a problem for open 
lists, but no worse a problem than we already have.

> but I'd want to think about them carefully, and would prefer to fit it into
> the current model.  We'd also need doctest for the feature and any other
> changes you make.

Oh, I guess I'll have to do some proper learning, then!

>
> - -Barry
>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.8 (Darwin)
>
> iEYEARECAAYFAkf+0oEACgkQ2YZpQepbvXH8SwCeL86BhmoIrur6Kbbapt8pMkJt
> gsoAn0v95kRIfMczhTJb9qu+Ighbha+K
> =JNvC
> -----END PGP SIGNATURE-----

-- 
Ian Eiloart
IT Services, University of Sussex
x3148