[Mailman-Developers] Looking at performance again

M.-A. Lemburg mal at lemburg.com
Wed May 14 19:41:09 EDT 2003


Barry Warsaw wrote:
> - What kind of a hit does the memberdb-in-a-pickle take?  Would things
>   go faster if we stored the member data in a Berkeley, MySQL, or
>   other real database?  I'd like to do some testing with my BDB member
>   code and I'm wondering if the folks working on other member adapters
>   have any performance feedback.

Apart from the fact that you'll have a tough time loading
100k members into memory, using a real database is a
tradeoff:

Much of Mailman's code is written on the assumption that
membership data access is fast. With a few thousand members,
everything still fits comfortably in memory, so that assumption
holds. With a few hundred thousand members, however, you need to
be more careful and load the data into memory only in chunks.

Mailman needs to be redesigned in a couple of places for that
to work. One example is the membership admin interface, another
is personalization.

So, no, you don't gain performance by putting small lists into
a database. And yes, once a list grows beyond a certain size,
there's simply no alternative: as soon as your machine starts
swapping, the database + chunking approach is faster.
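
To make the chunking idea concrete, here is a minimal sketch. It assumes
a hypothetical `members` table in an in-memory SQLite database as a
stand-in for whatever backend is actually used; the table layout and the
function name `iter_member_chunks` are illustrative and not part of
Mailman.

```python
import sqlite3


def iter_member_chunks(conn, listname, chunk_size=1000):
    """Yield lists of member addresses, at most chunk_size per list.

    Only one chunk is held in memory at a time. The table and column
    names are hypothetical, not Mailman's schema.
    """
    cur = conn.execute(
        "SELECT address FROM members WHERE listname = ? ORDER BY address",
        (listname,),
    )
    while True:
        rows = cur.fetchmany(chunk_size)
        if not rows:
            break
        yield [address for (address,) in rows]


# Demo data: 2500 made-up members on a list called "demo".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE members (listname TEXT, address TEXT)")
conn.executemany(
    "INSERT INTO members VALUES (?, ?)",
    [("demo", "user%05d@example.com" % i) for i in range(2500)],
)

# Process the membership chunk by chunk instead of loading it all at once.
chunk_sizes = [len(chunk) for chunk in iter_member_chunks(conn, "demo")]
```

Code such as the admin interface or personalization would then have to
work on one chunk at a time rather than on the full member dictionary.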

> - Do we win or lose with the process model, as compared to say, a
>   threading model?  I've been wondering if our fears of the Python GIL
>   are unfounded.  We could certainly reduce memory overhead by
>   multi-threading, and we might be able to leverage something like
>   Twisted, which is still in the back of my mind as a very cool way to
>   get multi-protocol support into Mailman.

I don't think that multi-threading would gain any performance.
It would make more sense to have Mailman use multiple SMTP
backends for delivery (MTA clustering).

BTW, is Mailman thread-safe?

> - Does our "NFS-safe" locks impose too much complexity and overhead to
>   be worth it?  Does anybody actually /use/ Mailman over NFS?  Don't
>   we sorta suspect the LockFile implementation anyway?  Would we be
>   better off using kernel file locks, or thread locks if we go to a MT
>   model?

You may want to have a look at mx.Misc.FileLock (in egenix-mx-base).
That's a portable file locking mechanism which is not the
fastest, but fast enough for most cases.
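
For comparison, the "kernel file locks" option from the question could
look roughly like the sketch below. This is not the mx.Misc.FileLock
API; it uses the standard library's `fcntl.flock`, is POSIX-only, and
the helper names `kernel_file_lock` and `is_locked` are made up for
illustration. Kernel locks of this kind have historically been
unreliable over NFS, which is exactly the tradeoff being asked about.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def kernel_file_lock(path):
    """Hold an exclusive flock() on path for the duration of the block."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks until the lock is free
        try:
            yield fd
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)


def is_locked(path):
    """Return True if another open file description holds the lock."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        try:
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return True
        fcntl.flock(fd, fcntl.LOCK_UN)
        return False
    finally:
        os.close(fd)


# Demo with a throwaway lock file (hypothetical path).
lock_path = os.path.join(tempfile.mkdtemp(), "list.lock")
with kernel_file_lock(lock_path):
    held_inside = is_locked(lock_path)
held_after = is_locked(lock_path)
```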

> Okay, now I'm rambling.  What is the lowest hanging fruit that we might
> be able to attack?  I'm up for any other ideas people have.

MTA clustering support. Basically just do round-robin
delivery to a list of SMTP hosts.
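
A minimal sketch of what I mean, assuming the recipients have already
been split into chunks. The host names and the helpers
`assign_chunks` and `deliver` are hypothetical and not existing Mailman
code; only `smtplib.SMTP` and its `sendmail` method are real standard
library calls.

```python
import itertools
import smtplib


def assign_chunks(recipient_chunks, hosts):
    """Pair each recipient chunk with an SMTP host in round-robin order."""
    if not hosts:
        raise ValueError("at least one SMTP host is required")
    rotation = itertools.cycle(hosts)
    return [(next(rotation), chunk) for chunk in recipient_chunks]


def deliver(plan, sender, message):
    """Send message to each chunk via its assigned host.

    Not called in the demo below, since it needs reachable SMTP servers.
    """
    for host, recipients in plan:
        with smtplib.SMTP(host) as smtp:
            smtp.sendmail(sender, recipients, message)


# Demo: three chunks spread over two made-up hosts.
hosts = ["smtp1.example.com", "smtp2.example.com"]
chunks = [["a@example.com"], ["b@example.com"], ["c@example.com"]]
plan = assign_chunks(chunks, hosts)
assigned_hosts = [host for host, _ in plan]
```

A real implementation would also need to decide what to do when one of
the hosts is down, for example by skipping it and retrying the chunk on
the next host.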

-- 
Marc-Andre Lemburg
eGenix.com




