[Mailman-Developers] LTMP for incoming mail

Brad Knowles brad at stop.mail-abuse.org
Thu Sep 28 18:06:29 CEST 2006


At 10:09 AM -0400 9/28/06, Barry Warsaw wrote:

>>  Does it have to be GPL?  Is a Berkeley-type license not okay?
>
>  GPL would be best, but Berkeley is probably okay.  We'd probably
>  want to get confirmation of that from the FSF.  The key thing is
>  that it has to be compatible with the GPL (and the Python
>  Software License -- see below) so that we can combine the whole
>  kit and kaboodle.

Are there any license questions or issues that we would need to have 
answered or confirmed by the Sendmail Consortium?  Or should we wait 
on that until we've heard back from the FSF?

>>  Dunno about doing it in Python, but I will say that going
>>  to Maildir as an additional queue-on-disk mechanism on top of
>>  everything else we're already doing seems to be a big step
>>  backward in terms of potential performance issues and I don't
>>  really see any significant positive benefit.
>
>  I don't think it's an additional queue-on-disk mechanism,
>  certainly in comparison to what we're doing today.

Maildir was not designed as an efficient queue-on-disk strategy.  It 
was designed to allow multiple simultaneous parallel deliveries to 
the NFS-mounted mailbox of a given user, and we know that it does a 
number of additional unnecessary things that seriously hurt its 
performance even in that relatively tightly defined context.

It does unnecessary file renames (which cause additional synchronous 
meta-data filesystem operations), it uses filenames that are too long 
and bust iname/inode caching schemes, and it doesn't make use of 
obvious significant performance-enhancing mechanisms like directory 
hashing.
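
For anyone who hasn't looked at how a Maildir delivery works, here is 
a minimal sketch of the write-then-rename sequence I'm referring to.  
The function name and the exact filename layout are my own 
illustration (real implementations differ in the details), but it 
shows the extra rename and the long per-message filename, with 
everything landing in one flat new/ directory:

```python
import itertools
import os
import socket
import time

_counter = itertools.count()  # per-process sequence number (illustrative)


def maildir_deliver(maildir, message_bytes):
    """Hypothetical helper: deliver one message Maildir-style."""
    for sub in ("tmp", "new", "cur"):
        os.makedirs(os.path.join(maildir, sub), exist_ok=True)

    # Long unique name built from time, pid, a counter, and the hostname.
    unique = "%d.P%dQ%d.%s" % (
        int(time.time()), os.getpid(), next(_counter), socket.gethostname()
    )
    tmp_path = os.path.join(maildir, "tmp", unique)
    new_path = os.path.join(maildir, "new", unique)

    # Step 1: write the complete message into tmp/ and force it to disk.
    with open(tmp_path, "xb") as f:
        f.write(message_bytes)
        f.flush()
        os.fsync(f.fileno())

    # Step 2: the additional metadata operation, a rename from tmp/ to new/.
    os.rename(tmp_path, new_path)
    return new_path
```

Every delivered message costs a create plus a rename, and nothing in 
this scheme spreads the files across subdirectories.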

It's pretty easy to design a mechanism that is much more efficient -- 
and scalable -- in handling multiple simultaneous deliveries to a 
user mailbox on NFS.


So why would we want to abuse a bad scheme for user-mailbox-on-NFS as 
an alternative scheme for queue-on-disk?

If we have queue-on-disk problems, why not solve them by implementing 
a more efficient queue-on-disk scheme, instead of abusing a poorly 
designed user-mailbox-on-NFS scheme?

>                                                That way,
>  you're not dumping all messages destined for Mailman into
>  one directory.  Not as good as directory hashing, but
>  better than what we have today.

That would be somewhat of an improvement in some respects, but 
Maildir also brings along a lot of additional baggage and I'm not at 
all convinced that it's worth the effort.

>  I'll grant you that LMTP delivery has the potential to be
>  the most efficient mechanism by which messages get from
>  the MTA into Mailman.  But it's certainly more work and
>  more complicated than maildir;  will you grant that maildir
>  is better than what we have today?  Think of it as a
>  waystation on the road to the ultimate uber-performing
>  list server. :)

I'm not at all convinced that Maildir would be an overall improvement 
over what we have today.  I think that keeping the fork()/exec() 
model and adding a directory hashing scheme to it would probably be 
a bigger improvement than replacing fork()/exec() delivery with 
Maildir as our inbox delivery mechanism.

At least by sticking with fork()/exec() and adding a directory 
hashing scheme on top of that, we wouldn't need to make any changes 
to the way we interface with MTAs today -- all the changes could be 
kept completely internal to Mailman.  If we were to switch to Maildir 
as an inbox delivery method, not only would we have to change the way 
we interface with MTAs, we would also have to make internal changes 
to Mailman to support the use of Maildir as our queue-on-disk 
mechanism.  That's a bigger overall change with bigger risk and 
relatively lower potential payoff.
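
To make "directory hashing" concrete, here is a rough sketch of the 
kind of thing I have in mind.  This is not Mailman code; the function 
names, the use of SHA-1, the two-level layout, and the ".msg" suffix 
are all assumptions made for illustration.  The point is only that 
the hash of some per-message key picks the subdirectory, so no single 
directory has to hold the whole queue:

```python
import hashlib
import os


def hashed_path(queue_root, message_id, levels=2, width=2):
    """Hypothetical helper: map a message key to a hashed queue path."""
    digest = hashlib.sha1(message_id.encode("utf-8")).hexdigest()
    # The first `levels` chunks of the digest become nested directories,
    # e.g. queue_root/ab/cd/<digest>.msg for levels=2, width=2.
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return os.path.join(queue_root, *parts, digest + ".msg")


def enqueue(queue_root, message_id, message_bytes):
    """Hypothetical helper: store one message under its hashed path."""
    path = hashed_path(queue_root, message_id)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(message_bytes)
    return path
```

With two levels of two hex characters each, the queue is spread over 
up to 65,536 leaf directories, and the MTA-facing side doesn't need 
to know anything about it.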


If we were to work on implementing a directory hashing scheme instead 
of working on Maildir, we could still add LMTP at a later date.

That would allow us to go back at a later time and enhance the 
features that we provide to mailing list administrators, while also 
giving us time to look more deeply into the potential performance 
issues and make sure that we're not causing more problems than we're 
solving.

>  Let me just say that ideally, I think LMTP would be a
>  great way to go.  It's not my top priority though.  I'm
>  looking for ways to get more developers involved in the
>  project, and this seems like a perfect thing for someone
>  seeking Mailman fame and fortune <wink>.

I'm not convinced that this is an improvement.

>  So, anyone care to take the challenge?

I'm not a developer, but I do have experience with building 
large-scale mail and mailing list systems, and if you're willing to 
listen to me then I'm willing to give you the benefit of my 
experience.


IMO, Maildir is a red herring.  The one and only reason to ever 
consider using Maildir is if you're implementing a large-scale IMAP 
mail server system and you're required to store user mailboxes on NFS.

Even then, you'd be well-served to look for better storage 
mechanisms, because throwing potentially hundreds of thousands of 
messages into a single directory is guaranteed to cause huge 
performance issues, even if every single mailbox operation didn't 
involve scanning the entire directory and doing a stat() on every 
single file, locking the entire directory, creating/renaming/deleting 
the file(s) as appropriate, and then unlocking the directory.
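
As a rough illustration of the scanning cost alone, here is a small 
sketch of what "scan the directory and stat() every file" amounts to.  
The function name is hypothetical and this is not taken from any 
particular IMAP server; it just shows that the work grows with the 
number of files in the directory, whatever the operation was that 
triggered the scan:

```python
import os


def scan_flat_directory(directory):
    """Hypothetical helper: stat() every file in one flat directory.

    Returns (number_of_files, newest_mtime).
    """
    count = 0
    newest = 0.0
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file():
                st = entry.stat()  # one stat() per message file
                count += 1
                newest = max(newest, st.st_mtime)
    return count, newest
```

With hundreds of thousands of messages in one directory, that loop 
runs hundreds of thousands of times per scan.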


I think we're better off spending our resources working on trying to 
resolve the real bottleneck issues that we already know are present 
in our system as opposed to working on cool stuff that may or may not 
help but would require more changes to more parts of the system, 
for a relatively lower potential payoff.

-- 
Brad Knowles, <brad at stop.mail-abuse.org>

"Those who would give up essential Liberty, to purchase a little
temporary Safety, deserve neither Liberty nor Safety."

     -- Benjamin Franklin (1706-1790), reply of the Pennsylvania
     Assembly to the Governor, November 11, 1755

  Founding Individual Sponsor of LOPSA.  See <http://www.lopsa.org/>.

