[Mailman-Users] big lists, big messages

Tib tib at tigerknight.org
Sun May 13 09:22:08 CEST 2001


On Sat, 12 May 2001, J C Lawrence wrote:
> Your math is off as it ignores RCPT-TO envelope size.

So throw in a few more bytes times 10k and it gets even bigger.
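As a rough sketch of that arithmetic: the 30k message size and 10k recipient count are the figures used in this thread, while the 40 bytes of envelope overhead per recipient is purely an assumed placeholder, as is the simplification that every recipient gets a separate copy (no RCPT TO bundling).

```python
def push_bytes(message_bytes, recipients, envelope_bytes_per_rcpt=0):
    """Total bytes sent if every recipient gets a separate copy of the message.

    envelope_bytes_per_rcpt is a hypothetical allowance for the
    RCPT TO line and related SMTP chatter for each recipient.
    """
    return recipients * (message_bytes + envelope_bytes_per_rcpt)


body_only = push_bytes(30_000, 10_000)
with_envelope = push_bytes(30_000, 10_000, envelope_bytes_per_rcpt=40)

print(body_only)      # 300000000, the rough 300meg figure
print(with_envelope)  # 300400000
```

The envelope term is small next to the body, so it makes the total bigger without changing the order of magnitude.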


> 1) If your messages are getting corrupted, AT ALL, you have far more
> serious problems than how fast your system is able to deliver a list
> broadcast.  Something is fundamentally broken and that needs to be
> fixed, now, before you start worrying about much else.
>
> 2) Transfer failures given a good MTA and reasonable choice of RCPT
> TO bundle size should cause minimal problem in delivery rates for
> the list broadcast.  Empirical testing here, for my admittedly very
> atypical membership/domain distribution suggests that between 5 and
> 25 is my sweet spot under Postfix.  Chuq IIRC has found for his
> load under Sendmail that somewhere in the 30 range is his sweet
> spot.  Your mileage will vary.
>
> 3) If delivery failures are clogging your MTA queue and are
> noticeably slowing delivery rates, you need to start thinking about
> reviewing your MTA configuration or using a different and more
> intelligent MTA.

Point
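For anyone wondering what "RCPT TO bundle size" looks like in practice, here is a minimal sketch of splitting a recipient list into bundles. The function name and the group-by-domain step are my own assumptions for illustration, not how Mailman or any particular MTA actually does it; the bundle size of 25 is just the top of the range quoted above.

```python
from collections import defaultdict


def bundle_recipients(addresses, bundle_size=25):
    """Split addresses into per-domain bundles of at most bundle_size.

    Grouping by domain is an assumption made for this sketch: it keeps
    each bundle aimed at a single destination host.
    """
    by_domain = defaultdict(list)
    for addr in addresses:
        domain = addr.rsplit("@", 1)[-1].lower()
        by_domain[domain].append(addr)

    bundles = []
    for domain in sorted(by_domain):
        rcpts = by_domain[domain]
        # Slice each domain's recipients into chunks of bundle_size.
        for start in range(0, len(rcpts), bundle_size):
            bundles.append(rcpts[start:start + bundle_size])
    return bundles


members = [f"user{i}@a.example" for i in range(60)] + ["solo@b.example"]
for bundle in bundle_recipients(members):
    print(len(bundle), bundle[0].split("@")[1])
```

Each bundle would go out as one SMTP transaction with several RCPT TO lines, so a smaller bundle means more transactions but less collateral delay when one recipient in the bundle fails.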

> Actually, list servers are generically disk IO bound with the
> primary factor in the disk IO being open/close/unlink time not
> read/write time.

I'm not quite the hardware guru I'd like to be yet - explain in plain English,
please?

> This assumes of course that the audience has web access, and in
> particular has web access at the time and on the device they would
> normally read the messages.
>
>   Example : It wouldn't work for me reading on the train on my
>   laptop.

Who has email but does not have web access at the time they get their email?
Granted, I suppose it's possible that you would download your mail ahead of
time before leaving home and then open your laptop on the train.

> Result?
>
>   I get to read what I'm interested in on the list, and any time I
>   post to the list, I get to see everything on that thread until it
>   dies.  Meanwhile the rest of the list passes me silently by.  I
>   can of course go read the main list folder any time I want, which
>   I do periodically to update the key word lists -- but usually it's
>   enough to just read -interesting.
>
> That sort of automation would be simply impossible with the web-based
> distribution you describe.

<condenses a couple emails to respond to a few different points in one email>

The load shift is a valid concern, and it does change delivery from a 'push' to
a 'pull' format. True: users who have only a passing interest in the article
being posted will probably not pull down the URL. What's the problem with
that? It's saved bandwidth. Anyone who is /actually/ interested and can take
the time to read through 30k worth of email will take an extra second to click
on a link to pull up what they want. And I do understand that this means a
person may pull on that link a number of times on different occasions.
However, it's now pulling on a webserver rather than pushing through an email
server, which depending on its configuration may have as few as one outbound
connection at a time (I'm not sure how rare /that/ is; my original sendmail
server only had 1 outbound, but my current qmail setup allows up to 50
outbound smtp connections at once), and a webserver is designed to be pulled
on a lot. Most basic configurations will start anywhere from 10 to 20
instances at once, and the bandwidth demand won't be high at all if you keep
the presentation simple, clean and non-graphicy, just like you'd get in a 30k
message that got pushed out to you. Plus, if you really want to get finicky,
processing http requests churns out fewer headers and less data than mail.

Also, unless 100% of your user base is actually reading the article, you're
going to be saving bandwidth. Some people may open an article more than once,
but most will read it once and be done; the more repeat views you get, the
more your savings shrink from a lot to a little. The load of an entire batch
(roughly 300meg by my estimate) will also be spread out over the course of a
few days, as everyone checks their mail and may or may not look at that
message and either follow the url or paste it into a browser themselves.
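To put rough numbers on the push versus pull comparison, here is a small sketch. The 30k article and 10k recipients are the figures from this thread; the 1k notice size, the 40% read rate and the 1.2 views per reader are made-up values for illustration only.

```python
def pull_total(notice_bytes, article_bytes, recipients,
               read_fraction, views_per_reader=1.0):
    """Bytes moved when a short notice is mailed and readers fetch the article.

    read_fraction: share of recipients who follow the link at all.
    views_per_reader: average number of fetches per person who does.
    """
    notices = notice_bytes * recipients
    fetches = article_bytes * recipients * read_fraction * views_per_reader
    return notices + fetches


def break_even_read_fraction(notice_bytes, article_bytes, views_per_reader=1.0):
    """Read fraction at which pull costs the same as pushing the full article."""
    return (article_bytes - notice_bytes) / (article_bytes * views_per_reader)


push = 30_000 * 10_000  # every recipient gets the full article by mail
pull = pull_total(1_000, 30_000, 10_000, read_fraction=0.4, views_per_reader=1.2)

print(f"push: {push / 1e6:.0f} MB, pull: {pull / 1e6:.0f} MB")
print(f"break-even read fraction: {break_even_read_fraction(1_000, 30_000):.2f}")
```

Note that the notice itself costs something, so under these assumptions the break-even point sits slightly below 100% readership rather than exactly at it, and repeat views pull it lower still.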


It all boils down to a matter of how you want to use your server. There are as
many good points for as against everything I have said and everything said in
response; however, none of the points presented has really changed my point of
view. There are two sides to this debate: the client side (where the rebuttal
to my view came from), and the server side (which I was presenting).


If you want the best of both worlds, perhaps consider doing what many
newspapers have done with their web presence and mailing list user base. Send
out a smaller email message which summarizes the content of the full article
and then present a link at the end. That gives users an idea of the article
rather than leaving them flying blind on whether or not to follow the link.
Cut a 30k message to 10k users down to maybe 5k in this way and you'll have a
userbase that doesn't have the entire message in their box, but at the same
time is not ignorant of the progression of the newsletter's topic.
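A minimal sketch of building such a cut-down message, assuming the article is plain text with blank-line paragraph breaks. The function name, the "Full article:" footer wording and the example URL are all hypothetical; the 5k budget is the figure suggested above.

```python
def make_teaser(article, url, max_bytes=5_000):
    """Keep whole leading paragraphs that fit in max_bytes, then append a link."""
    footer = f"\n\nFull article: {url}\n"
    budget = max_bytes - len(footer.encode("utf-8"))
    if budget <= 0:
        raise ValueError("max_bytes is too small to hold the link footer")

    kept = []
    used = 0
    for paragraph in article.split("\n\n"):
        # Account for the blank-line separator before every paragraph but the first.
        size = len(paragraph.encode("utf-8")) + (2 if kept else 0)
        if used + size > budget:
            break
        kept.append(paragraph)
        used += size

    # If even the first paragraph is too long, only the link is sent.
    return "\n\n".join(kept) + footer


article = "\n\n".join(["x" * 900] * 30)  # stand-in for a roughly 30k article
teaser = make_teaser(article, "http://www.example.com/articles/1")
print(len(article.encode("utf-8")), len(teaser.encode("utf-8")))
```

Cutting at paragraph boundaries rather than at a hard byte count keeps the teaser readable; a hand-written summary would of course do the job better than a truncation.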

<EOL>
Tib




