[Mailman-Developers] LMTP for incoming mail

Barry Warsaw barry at python.org
Thu Sep 28 20:40:51 CEST 2006


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sep 28, 2006, at 12:06 PM, Brad Knowles wrote:

> Are there any license questions or issues that we would need to have
> answered or confirmed by the Sendmail Consortium?  Or should we  
> wait on that until we've heard back from the FSF?

I would ask them if their license is GPL compatible.  IOW, do they  
believe we can combine GPL code with theirs?  Better yet would be  
cases where that's actually been done before.

> Maildir was not designed as an efficient queue-on-disk strategy.   
> It was designed to allow multiple simultaneous parallel deliveries  
> to the NFS-mounted mailbox of a given user, and we know that it  
> does a number of additional unnecessary things that seriously hurt  
> its performance even in that relatively tightly defined context.
>
> It does unnecessary file renames (which cause additional  
> synchronous meta-data filesystem operations), it uses filenames  
> that are too long and bust iname/inode caching schemes, and it  
> doesn't make use of obvious significant performance-enhancing  
> mechanisms like directory hashing.
>
> It's pretty easy to design a mechanism that is much more efficient  
> -- and scalable -- in handling multiple simultaneous deliveries to  
> a user mailbox on NFS.
>
> So why would we want to abuse a bad scheme for user-mailbox-on-NFS  
> as an alternative scheme for queue-on-disk?
>
> If we have queue-on-disk problems, why not solve them by  
> implementing a more efficient queue-on-disk scheme, instead of  
> abusing a poorly designed user-mailbox-on-NFS scheme?

Remember, this discussion all started because Postfix virtual host  
delivery is broken on the trunk.  The virtual_mailbox_maps feature is  
a new one since we last looked at how to integrate Mailman and  
Postfix.  What looks appealing about this is that we can actually
pre-sort messages based on recipient address, and in fact we could
pre-sort by domain, list name, and list alias.  This really has a big
advantage in that Mailman's incoming runner can do less message  
inspection to determine where that message is supposed to go.  If a  
file came from /usr/local/mailman/queue/in/mydomain/mylist/post, we  
know immediately that it's destined for list members.  Etc.  We also  
get a layout with fewer messages in more subdirectories for free.
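
To make that concrete, here is a minimal sketch of how a runner could
read the destination straight off the file's location.  The queue root
comes from the example path above; the Destination tuple, the classify()
helper, and the idea that the third path component is the list alias
are my own assumptions for illustration, not existing Mailman code.

```python
from pathlib import PurePosixPath
from typing import NamedTuple

# Queue root taken from the example path above (an assumption, not a fixed default).
QUEUE_IN = PurePosixPath("/usr/local/mailman/queue/in")


class Destination(NamedTuple):
    """Hypothetical record of where a queued message is headed."""
    domain: str
    listname: str
    subaddress: str  # list alias, e.g. "post"; other names are illustrative


def classify(path: str) -> Destination:
    """Derive the destination from a file's location under QUEUE_IN.

    Any components after the third (for example a maildir "new"
    directory and the message file name) are ignored.
    """
    # relative_to() raises ValueError if path is not under QUEUE_IN.
    rel = PurePosixPath(path).relative_to(QUEUE_IN)
    if len(rel.parts) < 3:
        raise ValueError(f"not a per-list queue path: {path}")
    domain, listname, subaddress = rel.parts[:3]
    return Destination(domain, listname, subaddress)
```

With this, classify() on anything below queue/in/mydomain/mylist/post
yields the domain, list name, and alias without opening the message.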

virtual_mailbox_maps don't appear to be useful for
delivery-to-program, and delivery-to-program with Postfix virtual domains as
we're doing them now has important disadvantages, most notably that  
we have to create both an alias and a virtual recipient, and we have  
to encode the domain name such that it's a valid alias, without  
introducing additional collisions.  That's icky.

I think John was asking about using virtual_mailbox_maps with  
delivery to mbox, but I think that's worse, because mbox delivery  
forces you to implement locking to avoid contention on the shared  
file.  So if we're going to utilize virtual_mailbox_maps I think  
we're stuck with a maildir layout in queues/in.
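
For anyone unfamiliar with why maildir avoids the locking problem, here
is a rough sketch of the usual tmp/new delivery pattern.  In the setup
under discussion Postfix would be doing this, not Mailman; the deliver()
function and its unique-name format are assumptions of mine, written
only to show the idea.

```python
import itertools
import os
import socket
import time

_counter = itertools.count()  # distinguishes deliveries within one process


def deliver(maildir: str, data: bytes) -> str:
    """Write one message into a maildir without taking any lock.

    Each message gets its own file, so writers never share a file the
    way they do with mbox.  Readers only look in new/, and a file only
    appears there once it is complete.
    """
    for sub in ("tmp", "new", "cur"):
        os.makedirs(os.path.join(maildir, sub), exist_ok=True)

    host = socket.gethostname().replace("/", "_").replace(":", "_")
    unique = f"{int(time.time())}.P{os.getpid()}Q{next(_counter)}.{host}"
    tmp_path = os.path.join(maildir, "tmp", unique)
    new_path = os.path.join(maildir, "new", unique)

    # Mode "xb" fails if the name already exists instead of overwriting it.
    with open(tmp_path, "xb") as fp:
        fp.write(data)
        fp.flush()
        os.fsync(fp.fileno())

    # The rename is atomic on a single filesystem: the message is either
    # absent from new/ or fully present, never half written.
    os.rename(tmp_path, new_path)
    return new_path
```

The rename is the extra metadata operation Brad objects to; it is also
what lets concurrent deliveries proceed without a shared lock.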

>>  I'll grant you that LMTP delivery has the potential to be
>>  the most efficient mechanism by which messages get from
>>  the MTA into Mailman.  But it's certainly more work and
>>  more complicated than maildir;  will you grant that maildir
>>  is better than what we have today?  Think of it as a
>>  waystation on the road to the ultimate uber-performing
>>  list server. :)
>
> I'm not at all convinced that Maildir would be an overall  
> improvement over what we have today.  I think that adding a  
> directory hashing scheme on a fork()/exec() model would probably be  
> a bigger improvement than changing our inbox delivery mechanism  
> from a fork()/exec() model and using Maildir instead.

I don't think there's any way to really know without implementing it
and doing some measurements.  Since we won't be losing
delivery-to-program, that would be possible.

> At least by sticking with fork()/exec() and adding a directory  
> hashing scheme on top of that, we wouldn't need to make any changes  
> to the way we interface with MTAs today -- all the changes could be  
> kept completely internal to Mailman.  If we were to switch to  
> Maildir as an inbox delivery method, not only would we have to  
> change the way we interface with MTAs, we would also have to make  
> internal changes to Mailman to support the use of Maildir as our  
> queue-on-disk mechanism.  That's a bigger overall change with  
> bigger risk and relatively lower potential payoff.

Nope, we simply have to implement a MaildirRunner to pull messages  
out of queue/in using the directory layout format we decide on.  We  
have to do something anyway because the current Postfix integration  
method for virtual domains is broken, and I think the fix is uglier  
and more error-prone than switching to a different integration
method.  I have no problem continuing to maintain delivery-to-program  
for other MTAs, or even Postfix where there's only a single domain.
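
As a very rough sketch of what I mean, assuming a layout of
queue/in/<domain>/<list>/<alias>/new/ (we haven't decided on the layout
yet), one pass of such a runner might look like the following.  The
scan_new() and run_once() names and the handler signature are
hypothetical; a real MaildirRunner would hook into the existing runner
machinery rather than take a callback.

```python
import os
from typing import Callable, Iterator, Tuple

Handler = Callable[[str, str, str, bytes], None]


def _subdirs(path: str) -> Iterator[str]:
    """Yield the names of the subdirectories of path, in sorted order."""
    for name in sorted(os.listdir(path)):
        if os.path.isdir(os.path.join(path, name)):
            yield name


def scan_new(queue_in: str) -> Iterator[Tuple[str, str, str, str]]:
    """Yield (domain, listname, alias, path) for each waiting message."""
    for domain in _subdirs(queue_in):
        for listname in _subdirs(os.path.join(queue_in, domain)):
            list_dir = os.path.join(queue_in, domain, listname)
            for alias in _subdirs(list_dir):
                new_dir = os.path.join(list_dir, alias, "new")
                if not os.path.isdir(new_dir):
                    continue
                for name in sorted(os.listdir(new_dir)):
                    yield domain, listname, alias, os.path.join(new_dir, name)


def run_once(queue_in: str, handle: Handler) -> int:
    """Hand every waiting message to handle(); return how many were processed."""
    count = 0
    for domain, listname, alias, path in scan_new(queue_in):
        with open(path, "rb") as fp:
            data = fp.read()
        handle(domain, listname, alias, data)
        # Remove the file only after the handler returned without raising,
        # so a failed message stays in the queue for the next pass.
        os.remove(path)
        count += 1
    return count
```

Note that the handler is told the domain, list, and alias up front, so
it never has to parse headers to work out where the message goes.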

> If we were to work on implementing a directory hashing scheme  
> instead of working on Maildir, we could still add LMTP at a later  
> date.

A directory hashing scheme is orthogonal to a maildir based queue/in  
scheme.  We should definitely do the former because it buys us  
advantages for the other queues.  We could definitely do LMTP later.   
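
By directory hashing I mean something along these lines: spread queue
files over a fixed fan-out of subdirectories chosen from a hash of the
file's base name, so no single directory grows huge.  This is only a
sketch; the hashed_path() name, the use of SHA1, and the two levels of
two hex digits are assumptions, not a design we've settled on.

```python
import hashlib
import os


def hashed_path(queue_dir: str, filebase: str,
                levels: int = 2, width: int = 2) -> str:
    """Return the path for filebase under a hashed subdirectory tree.

    With the defaults, files are spread over 256 * 256 leaf directories,
    and the same filebase always maps to the same directory.
    """
    digest = hashlib.sha1(filebase.encode("utf-8")).hexdigest()
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return os.path.join(queue_dir, *parts, filebase)
```

Because the location is a pure function of the file name, any runner
can find a file again without scanning, and the scheme applies equally
to the other queues.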
Or someone running a huge site that would really benefit from LMTP  
could funnel a portion of their profits into paying us to add it <wink>.

>>  Let me just say that ideally, I think LMTP would be a
>>  great way to go.  It's not my top priority though.  I'm
>>  looking for ways to get more developers involved in the
>>  project, and this seems like a perfect thing for someone
>>  seeking Mailman fame and fortune <wink>.
>
> I'm not convinced that this is an improvement.

Was that a comment on the preceding paragraph? :)

>>  So, anyone care to take the challenge?
>
> I'm not a developer, but I do have experience with building
> large-scale mail and mailing list systems, and if you're willing to
> listen to me then I'm willing to give you the benefit of my  
> experience.

Absolutely.  But getting LMTP support into Mailman will still require  
a developer to step up and write code.  Maybe Tokio or Mark can be  
convinced, or maybe there's another developer lurking out there who  
would be interested.  I just want to unbreak Postfix virtual domains  
and then fry our bigger fish.

> IMO, Maildir is a Red Herring.  The one and only reason to ever  
> consider using Maildir is if you're implementing a large-scale IMAP  
> mail server system and you're required to store user mailboxes on NFS.

I think the thing you're missing is that we need to get the messages  
from the MTA into Mailman's incoming queue /somehow/, and we're  
basically limited by what the various MTAs have to offer.  This is  
primarily an integration issue, so it's necessarily MTA-specific,  
even if we were to do nothing and stick with delivery-to-program.  We  
can -- and must -- do better with Postfix virtual domains, and as I  
see it, using virtual_mailbox_maps with maildir delivery is the best
option available.  I'm still open to other suggestions but as yet, I  
don't see a better way.

BTW, all this discussion of Postfix integration should not make Exim,  
Sendmail, or qmail users feel left out!  If there are better ways to
get mail from those MTAs into Mailman's incoming queue, I'm all for  
improving those integration points too.  I just need guidance from  
those more knowledgeable with those MTAs as to what changes we should  
make, if any.  We're not playing favorites, and we're not going to  
make any design choices that would improve Postfix integration at the  
expense of other MTAs.

> I think we're better off spending our resources working on trying  
> to resolve the real bottleneck issues that we already know are  
> present in our system as opposed to working on cool stuff that may  
> or may not help but would require more overall changes to more  
> parts of the system and with relatively lower potential payoff.

Fixing Postfix virtual domain integration is a real problem that  
needs solving, which is how this whole thread started.

- -Barry

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (Darwin)

iQCVAwUBRRwXOXEjvBPtnXfVAQIYogP/R0+WjnzoYylVdWR9779e9Giht6euldTQ
OjRYXw1IkLGoZOgbXCQF9UvUASw+3NGKVj5nRGKPVBaXOqAZZCYuQHkSTa0ZsIe/
oRBMtbYokHGxV9DFz5g7b6aoSLaHW8u0ieMdk1uvxcrVveVt8jjxD9IifDvhXYBV
V3HYgOrg7Dg=
=pdD3
-----END PGP SIGNATURE-----

