[Mailman-Developers] ANNOUNCE: GNU Mailman 3.0a1 (Leave That Thing Alone)

Barry Warsaw barry at list.org
Sat Apr 12 17:18:09 CEST 2008


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Apr 11, 2008, at 9:30 AM, Ian Eiloart wrote:

>> IIRC, this is going to be called when the "\r\n.\r\n" terminator to
>> the DATA command is seen.  That might be an interesting place to hook
>> in content filtering.
>
> Yes, that's the earliest opportunity to filter content. It also  
> allows the LMTP server to accept the message for some lists, but not
> others. It doesn't help the local SMTP server, though, since it's  
> too late for an SMTP server to selectively reject.
>
> Some admins may prefer this though. For example, I monitor my SMTP  
> server queues more closely than my Mailman queues.

So you mean that most SMTPd's will have accepted the message by the
time they issue the DATA to the LMTP?  If that's the case, then does
it even make sense to do content filtering at that step in LMTP, given  
that we'll still be doing the full-blown rule matching in the incoming  
runner?

>>> However, on closer inspection, there's a "channel" class which
>>> inherits from smtpd.SMTPChannel. This seems to override some useful
>>> looking methods:
>>>
>>>  def smtp_LHLO(self, arg):
>>>  def smtp_HELO(self, arg):
>>>
>>> Ah, and the source suggests to me that overriding smtp_RCPT is
>>> feasible. That's the place to check whether the list exists (and
>>> whether the sender is permitted to post - but let's do one thing at
>>> a time).
>>>
>>> Validating the recipient
>>> ------------------------
>>>
>>> All that's required is that the code which checks the existence of a
>>> list is moved from process_message to smtp_RCPT, and it's modified
>>> to "self.__rcpttos.append(address)" when the address is valid, but
>>> not otherwise. Thus, we build rcpttos instead of reading from it.
>>> LMTP requires a reply after DATA for only those addresses that
>>> haven't been rejected at RCPT. If I'm reading the code correctly,
>>> then a 550 error would be generated if the queue disappeared between
>>> RCPT and DATA (by virtue of the exception being thrown by
>>> queue.enqueue()).
>>
>> We probably want to refactor the code that splits the recipient
>> address into a separate method.  This would validate the fqdn list
>> name and subaddress, but it wouldn't instantiate a Switchboard object.
>
> Hmm, I don't know what a Switchboard object is.

The Switchboard object represents a queue's directory.  It's the class  
that handles enqueuing and dequeuing message pickles from that  
directory.
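
As a rough illustration of that idea (this is not Mailman's actual
Switchboard code; the class name, the file naming scheme and the method
names below are all assumptions made for the sketch), a queue directory
that stores message pickles might look like this:

```python
import os
import pickle
import tempfile
import time


class ToySwitchboard:
    """Hypothetical stand-in for a queue directory; not Mailman's class."""

    def __init__(self, queue_dir):
        self.queue_dir = queue_dir
        os.makedirs(queue_dir, exist_ok=True)
        self._counter = 0

    def enqueue(self, msg, msgdata=None):
        # Name files by timestamp plus a counter so they sort in arrival order.
        self._counter += 1
        filebase = "%020d-%06d" % (time.time_ns(), self._counter)
        path = os.path.join(self.queue_dir, filebase + ".pck")
        # Write under a temporary name, then rename, so that a reader
        # never sees a half-written pickle.
        tmp = path + ".tmp"
        with open(tmp, "wb") as fp:
            pickle.dump((msg, msgdata or {}), fp)
        os.rename(tmp, path)
        return filebase

    def files(self):
        # Return the base names of all queued entries, oldest first.
        names = os.listdir(self.queue_dir)
        return sorted(n[:-4] for n in names if n.endswith(".pck"))

    def dequeue(self, filebase):
        # Load one entry and remove its file from the directory.
        path = os.path.join(self.queue_dir, filebase + ".pck")
        with open(path, "rb") as fp:
            msg, msgdata = pickle.load(fp)
        os.remove(path)
        return msg, msgdata


# Usage: enqueue two entries into a scratch directory, then dequeue one.
switchboard = ToySwitchboard(tempfile.mkdtemp())
first = switchboard.enqueue("first message", {"listname": "test@example.com"})
second = switchboard.enqueue("second message")
print(switchboard.dequeue(first))
```

The point of keeping this separate from address validation is that
nothing here needs to exist until a message is actually going to be
delivered.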

>> Might as well skip that if the message isn't going to get delivered.
>> If everything looks good, store the fqdn listname and subaddress in
>> self._rcpttos (no need for double underscore) and then
>> process_message() can iterate over that and create the Switchboard
>> instance.
>
>> If the recipient is not valid, store a marker object instead so that
>> it can be compared against in the self._rcpttos loop in
>> process_message().
>
> No, I don't think that's necessary. If we don't like the argument to  
> RCPT we should say so with a 4xx or 5xx error, and then add NOTHING  
> to self._rcpttos.

Yep.  I wasn't thinking clearly when I wrote the above.  You're  
absolutely right.
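
A minimal sketch of that RCPT-time behaviour, assuming a hypothetical
set of known lists and an illustrative set of subaddress suffixes (none
of these names come from the Mailman source, and the 550/501 reply codes
are my choice for the sketch).  The recipient is split into fqdn
listname and subaddress, and only recipients that pass are remembered:

```python
# Hypothetical data: in Mailman this would come from the list database.
KNOWN_LISTS = {"test@example.com", "dev@example.com"}
# Illustrative subaddress suffixes, as in list-owner@ or list-request@.
SUBADDRESSES = {"bounces", "confirm", "join", "leave", "owner", "request"}


def split_recipient(address):
    """Split an address into (fqdn_listname, subaddress or None)."""
    cleaned = address.strip().strip("<>").lower()
    local, at, domain = cleaned.partition("@")
    if not (local and at and domain):
        raise ValueError("not a full address: %r" % address)
    listname, dash, suffix = local.rpartition("-")
    if dash and listname and suffix in SUBADDRESSES:
        return listname + "@" + domain, suffix
    return local + "@" + domain, None


class RcptSession:
    """Tracks the recipients accepted during one LMTP transaction."""

    def __init__(self):
        self._rcpttos = []

    def rcpt(self, address):
        # Reject bad recipients immediately and add NOTHING to _rcpttos.
        try:
            fqdn_listname, subaddress = split_recipient(address)
        except ValueError:
            return "501 Syntax error in recipient address"
        if fqdn_listname not in KNOWN_LISTS:
            return "550 No such mailing list"
        self._rcpttos.append((fqdn_listname, subaddress))
        return "250 Ok"

    def data_replies(self):
        # After DATA, LMTP expects one reply per *accepted* recipient,
        # in the same order as the accepted RCPT commands.
        return ["250 Ok <%s>" % name for name, _ in self._rcpttos]


# Usage: three RCPT commands, one rejected, so two post-DATA replies.
session = RcptSession()
for rcpt in ("<test@example.com>", "<nosuch@example.com>",
             "<dev-owner@example.com>"):
    print(rcpt, "->", session.rcpt(rcpt))
print(session.data_replies())
```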

> The RFC only wants us to give a reply for addresses that made it  
> past RCPT. It's crucial to get this right, because the returned list  
> of codes isn't indexed, and therefore has to be matched up with the  
> list of RCPT commands that didn't get a rejection.
>
> If you look at the example here:
> <http://www.apps.ietf.org/rfc/rfc2033.html#sec-4.2>, there are three
> RCPT commands, but only two post-DATA responses.
>
> Of course, we could say yes to everything at RCPT, but then our SMTP
> server callouts are pointless.
>
>> One problem is that until you get to process_message() (i.e. you've
>> seen the DATA terminating dot), you won't have a message object, and
>> many of the rules you're going to want require a message object.  You
>> could get around that by defining some custom rules that ignored the
>> 'msg' argument, and you could put everything you know so far in the
>> msgdata dictionary.  These rules would then only work in the lmtp
>> runner.
>
> Or, could you pass an empty message? I guess not if a list is  
> configured to reject implicit recipients, for example.

Actually, this occurred to me too, and I do think it would work.  You
have limited information at that point, but enough to create a
bare-bones Message instance that can satisfy the interface.  You just have
to be careful not to put any rules into that chain that require  
information you don't have at that point, but that should be fairly  
easy to do.
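
To make that concrete, here is a small sketch using the standard
library's email.message.Message as the placeholder.  The function name
and the msgdata keys are hypothetical, and Mailman's own Message class
may behave differently.  The envelope information goes into the
metadata dictionary, and header lookups on the empty message simply
come back empty:

```python
from email.message import Message


def make_placeholder(mail_from, rcpt_to):
    """Build an empty message plus the envelope data known at RCPT time."""
    msg = Message()  # no headers and no body yet
    msgdata = {
        "envelope_sender": mail_from,   # from MAIL FROM
        "envelope_recipient": rcpt_to,  # from RCPT TO
    }
    return msg, msgdata


def rule_missing_envelope_sender(mlist, msg, msgdata):
    # Safe at RCPT time: looks only at the metadata dictionary.
    return not msgdata.get("envelope_sender")


def rule_implicit_recipient(mlist, msg, msgdata):
    # NOT safe at RCPT time: with no To/Cc headers yet, this would
    # match every message, so it must stay out of the early chain.
    recipients = msg.get_all("to", []) + msg.get_all("cc", [])
    return not any(mlist in value for value in recipients)


msg, msgdata = make_placeholder("anne@example.com", "test@example.com")
print(msg["subject"])                                   # header is absent
print(rule_missing_envelope_sender("test@example.com", msg, msgdata))
print(rule_implicit_recipient("test@example.com", msg, msgdata))
```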

>> For the mlist argument, you probably need to run the rule-chain for
>> each RCPT TO.  Because you've already determined that it's a valid
>> mailing list above, you get the mailing list object and use that in
>> the rule chain.  It would therefore be possible to reject the message
>> for list A but accept it for list B.  I think you want to return a 553
>> if you're going to reject the recipient.
>>
>> Upshot is that in thinking about this, I think it's doable using the
>> rule-chain architecture during the smtp_RCPT call.
>
> That's good news.
>
>> Does the above make sense?
>
> To an extent. I know now that I was wise to ask before ploughing on!
>
> The first part seems quite easy to do, and has the significant gain  
> for me that it allows me to put my Mailman server on a different  
> host from my MTA. Currently, my MTA is looking for the existence of  
> the Mailman files for routing purposes. Calling out to the LMTP  
> server is a much more flexible solution.
>
> The second part, screening senders, seems harder at the moment,
> because I don't understand how the chains work. I'll take another
> look, but a brief explanation of the concept, and how they're called,
> would be useful.

All this is in the doctests (that's the beauty of testable  
documentation! :), but briefly:

You have rules, which have a check() method.  This method takes a  
mailing list, a message, and a metadata dictionary and it returns a  
boolean specifying whether the rule matched or not.  Rules are  
organized into chains, where each element of the chain is actually a  
"chain link".  The chain link ties a rule with an action.

Processing a chain is then just iterating over all the chain links,  
executing the rule in the link, and if the rule returns True,  
processing the action.  The actions are things like "jump to another  
chain", "take a detour through another chain", "stop processing this  
chain", "run some function".

So the idea is that we'll define a rule (or rules -- they should be as
narrow as possible) that implements the "verify whether this RCPT TO is
acceptable" check, and an action that has a callable that says "set the
RCPT TO return code", which is probably stored in the metadata
dictionary.  There'd probably be one such built-in chain, and it may be
YAGNI to allow per-list overrides.

So, you get the RCPT TO, you process the chain, grab the return code  
out of the metadata dictionary.  If it's a 250, you remember the  
recipient for later, and if not, you chuck it.
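
Here is a toy version of that flow, just to show the shape.  It is not
the real chain implementation: the Link class, the action names, the
"rcpt_code" metadata key and the banned-sender rule are all made up for
the example, and the mailing list is a plain dictionary:

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Link:
    """A chain link ties a rule to an action."""
    rule: Callable                       # rule(mlist, msg, msgdata) -> bool
    action: str                          # "run", "jump", "detour" or "stop"
    function: Optional[Callable] = None  # used by "run"
    chain: Optional[str] = None          # used by "jump" and "detour"


def process_chain(chains, name, mlist, msg, msgdata):
    for link in chains[name]:
        if not link.rule(mlist, msg, msgdata):
            continue                     # rule did not match; next link
        if link.action == "run":
            link.function(mlist, msg, msgdata)
        elif link.action == "detour":    # visit another chain, then resume
            process_chain(chains, link.chain, mlist, msg, msgdata)
        elif link.action == "jump":      # switch chains and do not come back
            process_chain(chains, link.chain, mlist, msg, msgdata)
            return
        elif link.action == "stop":      # stop processing this chain
            return


def always(mlist, msg, msgdata):
    return True


def sender_banned(mlist, msg, msgdata):
    # Narrow rule that needs only the envelope sender, not a message.
    return msgdata["envelope_sender"] in mlist["banned"]


def reject_recipient(mlist, msg, msgdata):
    msgdata["rcpt_code"] = 553


CHAINS = {
    "rcpt": [Link(sender_banned, "jump", chain="reject")],
    "reject": [Link(always, "run", function=reject_recipient),
               Link(always, "stop")],
}


def rcpt_code(mlist, sender):
    # Start from "accept" and let the chain overwrite the code.
    msgdata = {"envelope_sender": sender, "rcpt_code": 250}
    process_chain(CHAINS, "rcpt", mlist, None, msgdata)
    return msgdata["rcpt_code"]


mlist = {"fqdn_listname": "test@example.com",
         "banned": {"spammer@example.net"}}
print(rcpt_code(mlist, "anne@example.com"))
print(rcpt_code(mlist, "spammer@example.net"))
```

The caller only looks at the code left in the metadata dictionary: keep
the recipient on a 250, drop it on anything else.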

>> If you decide to work on it, I'd be happy to help answer more
>> specific questions, and of course review the code.  I'm not opposed
>> to making some architectural changes to suit this better,
>
> Hmm, well if we can't do stuff at RCPT time, then I think the
> architecture is flawed. Whether it's reasonably easy to fix, I can't
> tell.

See above, I think we can.

> I think it's really important to separate out the code for  
> determining whether a sender can post to a list. In my experience,  
> it's the only test that (a) users really understand and (b) is  
> really useful.

It probably does make sense to refactor this out.  I can imagine the  
REST interface is going to want to ask the same question.

> It's also important because it helps us to decide whether we are
> safe to generate a bounce message, in the event that the message  
> content is bad. I'd argue that it usually is safe when the sender is  
> a member of the list, but usually not otherwise. That still leaves  
> us with a problem for open lists, but no worse a problem than we  
> already have.

There's something else that Mailman 3 allows us to do.  We can ask  
whether we know anything about the sender or not.  So she may not be a  
member of List A, but maybe she's a member of List B.  So when she  
posts to List A, we at least know that the sender is a real person.   
Maybe we send her a bounce, maybe we don't, but I think knowing that  
the site has a relationship with the sender can come in handy.

>> but I'd want to think about them carefully, and would prefer to fit
>> it into the current model.  We'd also need doctests for the feature
>> and any other changes you make.
>
> Oh, I guess I'll have to do some proper learning, then!

I hope the above helps!  Let me know if any of the doctests don't make
sense as documentation.

Good luck!
- -Barry

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.8 (Darwin)

iEYEARECAAYFAkgA0rQACgkQ2YZpQepbvXEbPQCeNBockJ6jmb9nugeGxT3t6017
Oo0An0SnGNW8Sn4+O9yfbZ9NGPO/sRFh
=ljz2
-----END PGP SIGNATURE-----

