[Mailman-Developers] LMTP for incoming mail

Thu Sep 28 22:59:39 CEST 2006

At 2:40 PM -0400 9/28/06, Barry Warsaw wrote:

>  I would ask them if their license is GPL compatible.  IOW, do they
>  believe we can combine GPL code with theirs?  Better yet would be
>  cases where that's actually been done before.

I'll send a note and ask.

>  Remember, this discussion all started because Postfix virtual host
>  delivery is broken on the trunk.  The virtual_mailbox_maps feature
>  is a new one since we last looked at how to integrate Mailman and
>  Postfix.

But this is purely a Postfix issue, and now we're talking about making 
potentially large architectural changes to the system to support this 
one MTA, without necessarily considering whether that buys us 
anything for any of the other MTAs.

I understand that integration with Postfix is sub-optimal today, but 
I'm not convinced that it makes sense to seriously consider an option 
that may result in throwing out all the other MTAs in order to fix 
things with Postfix.  Worse yet would be trying to maintain two 
different systems, one for Postfix and one for everyone else, or 
making architectural changes to support the new stuff for Postfix 
that may hurt us for other MTAs.

At the very least, I think it makes sense to look at the overall 
cost/benefit ratio.

Let's assume that we have two systems that are otherwise identical, 
with roughly equivalent traffic.  System A has a single big list, 
while system B has a number of smaller lists, but the overall 
aggregate traffic is equal.

With a Maildir solution, system A will see no benefit to the inbound 
queueing, because you're going to get the same level of contention 
within a single inbound directory for the one big list as you would 
for a single inbound directory for all lists on the system.  System B 
would get a benefit, since each list would not be competing for 
immediate synchronous meta-data update resources with the other 
lists, although there would still be some intra-list competition.

With a hashed directory solution, both systems would see the same 
level of benefit, and intra-list competition would be no worse than 
inter-list competition.  And if the competition were to get too high, 
you simply increase the level of directory hashing.
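
To make the hashing idea concrete, here is a minimal sketch in Python.  
It is an illustration only, not anything that exists in Mailman: the 
names bucket_for and directory_load, the use of SHA-1 over the 
Message-ID, and the two-hex-digits-per-level layout are all 
assumptions.  The point is that the bucket depends only on the 
message, not on which list it belongs to, so one big list spreads 
across the same subdirectories as many small ones.

```python
import hashlib
from collections import Counter


def bucket_for(message_id: str, levels: int = 1) -> str:
    """Return a relative hashed subdirectory such as 'a3' or 'a3/f0'.

    Hypothetical layout: each level consumes two hex digits of the
    digest, giving 256 subdirectories per level.
    """
    digest = hashlib.sha1(message_id.encode("utf-8")).hexdigest()
    return "/".join(digest[2 * i: 2 * i + 2] for i in range(levels))


def directory_load(message_ids, levels: int = 1) -> Counter:
    """Count how many messages land in each hashed subdirectory."""
    return Counter(bucket_for(m, levels) for m in message_ids)


# Made-up Message-IDs standing in for one busy list's traffic.
ids = [f"<{n}@example.org>" for n in range(10000)]
load = directory_load(ids)
print("directories used:", len(load))
print("busiest directory holds:", max(load.values()), "messages")
```

Raising levels from 1 to 2 is what "increase the level of directory 
hashing" would mean under these assumptions: each extra level 
multiplies the number of leaf directories by 256.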

With a Maildir solution, you give up your ability to implement a 
hashed directory solution, because the MTA would no longer know how 
to write messages to your hashed mailbox-directory-per-list, and to 
get around that you'd have to have some sort of customized local 
delivery agent no matter what.

With a hashed directory solution, if necessary or desirable you could 
still implement a separate directory tree per list within your 
customized local delivery agent, and that directory tree per list 
could look however you want.

Moreover, a Maildir mailbox-per-list solution doesn't do anything for 
outbound queues, whereas a properly implemented hashed directory 
solution should affect outbound at least as much as inbound, at no 
additional implementation cost.

>  virtual_mailbox_maps don't appear to be useful for
>  delivery-to-program, and delivery-to-program with Postfix
>  virtual domains as we're doing them now has important
>  disadvantages, most notably that we have to create both an
>  alias and a virtual recipient, and we have to encode the
>  domain name such that it's a valid alias, without
>  introducing additional collisions.  That's icky.

I know that our current solution is sub-optimal, but I'm not 
convinced that it's the only way to skin this cat.  Moreover, I'm 
also not convinced that Maildir is the only effective way to make use 
of virtual_mailbox_maps.

I am pretty much convinced that using Maildir will effectively 
preclude the ability to make use of directory hashing, precisely 
because you're letting the MTA write directly to a poor standard 
interface instead of handling the internal issues in a manner that is 
opaque to the MTA.

>  I don't think there's any way to really know without
>  implementing it and doing some measurements.  Since we
>  won't be losing delivery-to-program, that would be possible.

True enough, but there is a cost in terms of lost opportunity, and in 
pushing out the delivery schedule long enough to determine which 
method is going to work better overall.

>  Nope, we simply have to implement a MaildirRunner to pull
>  messages out of queue/in using the directory layout format
>  we decide on.

With Maildir, you don't have any choice in what the directory layout 
will look like.  That's standardized within the Maildir 
implementation, and you can't change that.  Otherwise, you wouldn't 
be using Maildir anymore, you'd be using 
mailbox-directory-solution-that-looks-kinda-semi-sorta-like-maildir-but-modified-by-Mailman-and-incompatible-with-everything-else.
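
For reference, the fixed layout in question can be seen with Python's 
standard mailbox module.  This is a rough sketch, not the proposed 
MaildirRunner: the drain function, the "mylist" directory name, and 
the sample message are made up for illustration.  A Maildir always 
consists of the tmp, new, and cur subdirectories, and a delivery is 
written under tmp and then moved into new, which is where a runner 
would have to pick it up.

```python
import mailbox
import os
import tempfile


def drain(maildir_path, handle):
    """Pass each queued message to handle(), then remove it.

    Hypothetical helper: a stand-in for whatever would move messages
    from the Maildir into the internal queue.
    """
    box = mailbox.Maildir(maildir_path, create=False)
    processed = 0
    for key in list(box.keys()):
        handle(box[key])
        box.remove(key)
        processed += 1
    return processed


root = tempfile.mkdtemp()
list_dir = os.path.join(root, "mylist")

# Creating the Maildir creates its three standard subdirectories.
box = mailbox.Maildir(list_dir, create=True)
layout = sorted(os.listdir(list_dir))

# Simulate one delivery; the message file ends up in new/.
box.add("From: sender@example.org\nSubject: test\n\nhello\n")
queued = len(os.listdir(os.path.join(list_dir, "new")))

subjects = []
processed = drain(list_dir, lambda msg: subjects.append(msg["Subject"]))
remaining = len(os.listdir(os.path.join(list_dir, "new")))

print(layout, queued, processed, remaining)
```

Note that nothing in this layout leaves room for hashed 
subdirectories under new; adding them would mean the directory is no 
longer something a stock MTA knows how to deliver into.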

>  A directory hashing scheme is orthogonal to a maildir
>  based queue/in scheme.

I'm not convinced of that.  In fact, I'm convinced that they are 
pretty much mutually exclusive.  That is, unless you're talking about 
using Maildir as a second level of queue-on-disk, before you get to 
the Mailman-internal queue-on-disk mechanism.

Now, if you are talking about two levels of queue-on-disk so that we 
can get both Maildir and queue directory hashing, I think that's 
going to be much, much worse than sticking with the existing Postfix 
virtual domain solution.

>  Or someone running a huge site that would really benefit
>  from LMTP could funnel a portion of their profits into
>  paying us to add it <wink>.

I don't think we're doing enough traffic on python.org for them to 
justify paying for it.  I don't think that Apple is doing enough 
traffic with Mailman for them to justify paying for it -- not with 
what we've heard about how the new(er) MacOS X hardware is 
performing, and especially not with the total lack of any support (or 
even acknowledgement) that we get from the corporate types.

I don't think that any of the open-source projects (like FreeBSD) are 
going to be in a position to pay for something like this, or to 
develop & contribute the necessary code, although they might be doing 
enough traffic that they could certainly use these features if they 
were available.

I think that only leaves us with a site like SourceForge, and I think 
you've probably got better contacts there than any of the rest of us.

>  Absolutely.  But getting LMTP support into Mailman will
>  still require a developer to step up and write code.

I'm not that concerned about LMTP.  I think that's a big enough issue 
that we can leave that alone for now.

>  Maybe Tokio or Mark can be convinced, or maybe there's
>  another developer lurking out there who would be
>  interested.  I just want to unbreak Postfix virtual
>  domains and then fry our bigger fish.

I would like to see them unbroken, but I also don't want to see 
anything done that would preclude the use of hashed queue 
directories, or that would add a second level of queue-on-disk and 
yet another source of potential bottlenecks.

>  I think the thing you're missing is that we need to get
>  the messages from the MTA into Mailman's incoming queue
>  /somehow/, and we're basically limited by what the
>  various MTAs have to offer.

I certainly did not understand that you wanted to use this as a way 
to unbreak Postfix virtual domains, no.

No, I didn't get that at all.

I'm still not convinced that this is the best way to unbreak Postfix 
virtual domains, however.

>  Fixing Postfix virtual domain integration is a real problem that
>  needs solving, which is how this whole thread started.

Agreed, this is a real problem that needs to be resolved.

-- 
Brad Knowles, <brad at stop.mail-abuse.org>

"Those who would give up essential Liberty, to purchase a little
temporary Safety, deserve neither Liberty nor Safety."

     -- Benjamin Franklin (1706-1790), reply of the Pennsylvania
     Assembly to the Governor, November 11, 1755

  Founding Individual Sponsor of LOPSA.  See <http://www.lopsa.org/>.