[Mailman-Developers] Huge lists

J C Lawrence claw@kanga.nu
Thu, 25 May 2000 00:05:38 -0700


On Wed, 24 May 2000 22:13:08 -0700 
Chuq Von Rospach <chuqui@plaidworks.com> wrote:

> At 9:52 PM -0700 5/24/2000, J C Lawrence wrote:
>> Umm, true.  Looking at it again, and doing a quick check of my
>> user base's MXing, I suspect we're dealing with a less than 1%
>> gain.  Bigger fish are available.  Methinks my brain was farting.

> Nope, just getting a bit ahead and thinking of something that's
> fun and technically challenging. Been there, done that... (grin).

Well, yes.

>> I don't believe that a list server has any business handling MX
>> sorting unless it is also taking responsibility for being the
>> list MTA.  As Mailman isn't, it's a moot point.

> And that's an issue I've been wrestling with a lot -- do I do a
> specialized MTA? Or do I let the MTA do its job. After going back
> and forth on this for weeks, given my current delivery rates, I've
> decided to let the MTA do its job, and wait on writing a
> specialized MTA until I need that last couple of percent of
> performance. Moving to 8.10.1 seems like an easy performance bump,
> postfix looks like it'll buy me even more, and so while doing all
> the MXing and stuff would be fun, it can wait.

I'm a big believer in being lazy.  I don't like handling problems
that other people seem to have done a decent enough job already
unless they're *really* interesting.  There are already enough
interesting things out there to do to keep me busy, so that doesn't
happen often.

Ergo, if the MTA looks like it can do it, can be persuaded to do it,
and that means I really don't have to, then well, who am I to argue?
QMail, with which I'm becoming uncomfortably familiar, is bloody
fast.  I hope/expect that Postfix is very similar.  It's tough to
imagine a situation where my time and effort in replacing them (as a
solo effort) would actually be worth it versus throwing hardware
at the problem or chatting up Wietse & co.  I've written list
servers and mini-MTAs before.  There's a fair bit of hidden
complexity and brain hurt in there I don't mind avoiding.

> queue management is another issue. that's one place majordomo is
> weak at, because it doesn't. Everything is delivered as it comes
> in, so bursts can take a system to its knees.

True.  Were Mailman asynchronous, a pattern as below would seem
useful:

  There is never more than a single "queued message handler" process
  (maybe multi-threaded, or not).  That process guarantees not to
  feed messages to the MTA any faster than XXX messages per
  second/minute, and to stop such feeding were system load to rise
  above ZZZ.  The single instance rule prevents multiple handler
  processes for multiple mailing lists maxing out the MTA as they
  all dump simultaneously.  The problem of multiple list servers
  (boxes) dumping simultaneously to a remote MTA is properly, I
  believe, outside of Mailman's purview.
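
A minimal sketch of such a handler in Python, offered as an
illustration only and not as anything Mailman actually ships.  The
names ThrottledFeeder and acquire_single_instance are hypothetical,
max_per_sec and max_load stand in for the XXX and ZZZ above, and the
lock file and load average calls assume a Unix box:

```python
import fcntl
import os
import time


def acquire_single_instance(path):
    """Return an open, locked file handle, or None if another handler holds it."""
    fh = open(path, "a")
    try:
        fcntl.flock(fh, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        fh.close()
        return None
    return fh  # keep this open for the life of the process


class ThrottledFeeder:
    """Hand messages to the MTA no faster than max_per_sec, pausing under load."""

    def __init__(self, deliver, max_per_sec, max_load,
                 clock=time.monotonic, sleep=time.sleep,
                 loadavg=lambda: os.getloadavg()[0]):
        self.deliver = deliver            # callable that passes one message to the MTA
        self.interval = 1.0 / max_per_sec
        self.max_load = max_load
        self.clock = clock
        self.sleep = sleep
        self.loadavg = loadavg
        self.next_send = clock()

    def feed(self, messages, backoff=5.0):
        for msg in messages:
            # Stop feeding while the 1-minute load average is above the ceiling.
            while self.loadavg() > self.max_load:
                self.sleep(backoff)
            # Pace deliveries so the rate never exceeds max_per_sec.
            delay = self.next_send - self.clock()
            if delay > 0:
                self.sleep(delay)
            self.deliver(msg)
            self.next_send = max(self.clock(), self.next_send) + self.interval
```

The clock, sleep and load functions are passed in only so the pacing
can be exercised without real waiting; a real handler would call
acquire_single_instance() once at startup and exit if it returns None.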

I don't see a value in trying to monitor MTA queue size.  Too MTA
specific.  Monitoring list server queue size and implementing
fairness algorithms in emptying the queue across multiple lists is
worth looking at tho (arguably round-robin is "good enough" and is
very simple to do, but it ain't pretty when things get ugly).
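
For illustration, the round-robin case might look something like the
following Python sketch.  The round_robin function and the shape of
the queues argument (list name mapped to pending messages) are
assumptions for the example, not Mailman's own data structures:

```python
from collections import deque


def round_robin(queues):
    """Yield (list_name, message) pairs, taking one message per list per pass."""
    pending = deque((name, deque(msgs)) for name, msgs in queues.items() if msgs)
    while pending:
        name, msgs = pending.popleft()
        yield name, msgs.popleft()
        # A list with messages left goes to the back of the line.
        if msgs:
            pending.append((name, msgs))
```

With one busy list and one quiet one, the quiet list's message goes
out second instead of waiting behind the whole backlog.  It does
nothing about weighting by list size or message age, which is where
it stops being pretty.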

> Another thing to worry about... On my big system, I only do a few
> mailings a week, but they bunch together. So I've had to do a
> bunch of work on making sure the system deals with this
> rationally...  when we were doing one mailing on a given day, that
> was easy, but we're doing both a text and an HTML variant going
> out together, and that really complicates life.

Yeah, volume, as opposed to number of messages, can be another
problem if only for disk IO.

> Well, this is probably preaching to the choir, but I've gotten
> quite convinced that you isolate every piece you can from every
> other piece, and document the interfaces. that makes it quite easy
> to swap out a new piece without affecting the rest of the system
...

This is often called "programming by contract" (or "design by
contract").  It's a Good Thing.

>> -- Allows archived messages to be replied to on the web via the
>> archive page (replies post to the list).

> Nice! does it restrict posting access to registered users or is it
> open?

I let the list server handle that.  As I insert a special header in
messages coming from the web interface, it's very easy to configure
Mailman to hold such postings for moderator approval (privacy options
page).

> I used to use it, and then switched my web archives to a full
> forum system (web crossing) and crosslinked everything. that has
> its advantages and disadvantages.

One of my list members has been advocating WebCrossing.  What do you
think of it?  It seemed excessively constraining to me, especially
since I'm heading toward a massively WikiWiki-fied setup (every page
can be commented on Wiki-style, all comments are free-standing Wiki
entities etc etc etc).

>> -- Supports archive searching by MessageID.  I've an MTA hack
>> that inserts a MessageID-based URL into all outgoing Mailman list
>> traffic so the user can just hit the URL and be taken to that
>> message in the archives (searches the MHonArc DB, useful for
>> thread reference etc).

> Interesting hack. Very interesting hack. 

<bow>  Wish it were original to me.  One of my list members came up
with the idea and then went and implemented it.  It's somewhere in
Keystone under Tasks...
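
The general idea can be sketched in a few lines of Python.  This is
not the actual hack: the archive URL, the X-Archive-URL header name
and the add_archive_url function are all made up for the example, and
it assumes the URL goes into a header, not the body:

```python
from email.message import EmailMessage
from urllib.parse import quote

# Hypothetical search endpoint; substitute whatever the archive exposes.
ARCHIVE_SEARCH = "https://lists.example.org/archive/msgid?id="


def add_archive_url(msg, base=ARCHIVE_SEARCH, header="X-Archive-URL"):
    """Add a header pointing at the archived copy, keyed by Message-ID."""
    msgid = str(msg.get("Message-ID", "")).strip().strip("<>")
    if msgid:
        # Percent-encode everything outside the unreserved set, "@" included.
        msg[header] = base + quote(msgid, safe="")
    return msg
```

A message with no Message-ID is passed through untouched, since there
is nothing to search the archive on.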

> you could do something really nice with PHP and MySQL, too, and do
> away with MHonarc, and parse/templatize the text on the
> fly. that's sort of where I'm headed down the road....

Yeah, I've thought about that but I really just don't see enough
advantage to justify the time it would take to get something better
than I have now.  Earl Hood (MHonArc author) has also been
threatening to do something here for ages.

It's awfully tempting tho just on a "cool!" factor.

-- 
J C Lawrence                                 Home: claw@kanga.nu
----------(*)                              Other: coder@kanga.nu
--=| A man is as sane as he is dangerous to his environment |=--