[Mailman-Users] How many messages in an envelope?

J C Lawrence claw at 2wire.com
Wed Apr 25 05:08:28 CEST 2001


On Tue, 24 Apr 2001 15:53:47 -0700 (PDT) 
alex wetmore <alex at phred.org> wrote:

> On Tue, 24 Apr 2001, Erik Parker wrote:
>> > Is there a way to bump this down to say.. 2 people per
>> > envelope? Vs.. The 20 or 30 it seems to put in now?
>> 
>> Sorry to waste your bandwidth, I found what I was looking for in
>> the Defaults.py .. Default was 500, I bumped it down to 2.. What
>> do you know, sending messages to 25000 people at a time now, goes
>> quite fast.

> Increasing this number should result in lower bandwidth usage and
> higher performance.  

This is not necessarily true.  It depends heavily on the total
number of spool entries and the level of contention among queue
runners for deliveries, since a larger envelope reduces the total
parallelism possible in the delivery process.  It also depends on
the average load on your mail spool.  If your spool is constantly
busy then lock contention will be consequently lower, as the spool
never drains and there are always other entries to process.

Consider the simple case for your 1K count envelope.  The gain is
that due to the low number of transactions the handoff to the MTA is
rapid.  The problem is that without a constantly active mail server,
the delivery rate by the MTA will tend to be low.

  For a list with 25K members there are now 25 spool entries.  This
  means that there can be at most 25 queue runners.  Problem is,
  some number of those queue runners are going to be locked at any
  given instant dealing with slow/unresponsive MXes, leaving your
  total "live" count of actively delivering queue runners as a much
  smaller number (and one that decreases rapidly as the situation
  becomes more pessimal across the queue lifetime).

  Conversely, setting the max envelope size to a small value, say
  10, will result in 2.5K queued entries for the same list post.
  Given an MTA configured for say 50 queue runners (depending on
  system I typically run between 20 - 80 with a max of 10 per target
  MX) you'll likely find that you can maintain queue runner
  saturation across the majority of the queue run time (depends on
  the distribution of your targets).  Less contention over queue
  entries equals greater parallelism equals faster delivery times.
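
The spool-entry arithmetic in the two cases above can be sketched in a
few lines of Python.  This is illustrative only: spool_entries and
max_busy_runners are hypothetical helpers, not part of Mailman or any
MTA, and the sketch assumes recipients are simply split into
fixed-size envelopes with at most one queue runner per spool entry.

```python
import math

def spool_entries(members, envelope_size):
    """Spool entries produced by one list post (hypothetical helper)."""
    return math.ceil(members / envelope_size)

def max_busy_runners(members, envelope_size, queue_runners):
    """Upper bound on concurrently busy queue runners, assuming
    at most one runner can work on a spool entry at a time."""
    return min(queue_runners, spool_entries(members, envelope_size))

# 25K members, MTA configured for 50 queue runners
for size in (1000, 10):
    print(size, spool_entries(25000, size), max_busy_runners(25000, size, 50))
```

With a 1K envelope the 25 entries cap the number of useful runners
below the configured 50; with an envelope of 10 the 2.5K entries leave
the runner limit as the only cap.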

ObExample: With a 2K member list with envelope size set to 500 on
my main test system under Postfix it takes roughly 14 minutes to
drain the queue of responsive MXes.  After reducing the envelope
size to 5, the drain time is now just over 3 minutes before the
spool becomes quiescent (only slow MXes left).  The gating factor on
that 3 minutes is transactional overhead on the deliveries, not disk
IO or bandwidth.

Further, values above ~20 will tend to result in your mail being
silently dropped/deleted by AOL.  It's not clear what the exact
trigger value is for AOL, and of course they don't publish it, but
empirical testing suggests that it is currently 17.
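
For reference, a site override might look like the fragment below.
This is a sketch under assumptions: it assumes the Defaults.py setting
discussed above is the one named SMTP_MAX_RCPTS (check your own copy),
and that the override lives in mm_cfg.py rather than being edited into
Defaults.py.  The value 15 is only an example chosen to sit below the
AOL trigger suggested by the testing above.

```python
# mm_cfg.py fragment (sketch).  Assumption: the envelope-size setting
# found in Defaults.py is named SMTP_MAX_RCPTS; verify against your copy.
# Example value only: below the ~17 recipient AOL trigger noted above.
SMTP_MAX_RCPTS = 15
```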

> As an example, assume that your list has 1000 users on
> hotmail.com.  If you are sending messages with 2 recipients then
> the best that your MTA can do is to batch two hotmail.com
> recipients together.

Codswallop.  Intelligent MTAs upon successfully connecting to an MX
will fork multiple queue runners to that target attempting to
deliver N parallel streams to the MX until the queue for that MX is
dry.  Even sendmail (finally) does this.

> If you are sending with 500 envelope recipients then it might be
> able to batch half of the hotmail.com recipients together in one
> shot.  
 
What the greater number of entries in the envelope gains you is:

  -- Faster handoff to MTA time
  -- Lower transaction rate
  -- Lower bandwidth requirements
  -- Lower disk IO
  -- Generally lower system load (outside of MTA configuration)

What it costs you:

  -- Lower parallelism opportunities
  -- Greater spool entry contention

What works best for you will depend on your cost structure, your
mail load, and the distribution (and responsiveness) of your target
MXes.  You are trading off parallelism against bandwidth in the end,
with a second order trade of diskIO against transaction rate behind
that.  

Where your sweet spot is will depend on your load.  For my setup (a
generally idle system doing ~200K outbound deliveries per day) a
small envelope size under Postfix (used to be Exim) works well as I
can effectively maintain a high level of parallelisation (50 queue
runners, 10 per MX) during the delivery period.  With an envelope
size of 10 (versus 5) my queue run time increases by almost 60%.
Moving to 100 almost trebles the queue run time (the graph seems to
flatten after about 70 as lock contention becomes the binding
factor).

  ObExcuse: I previously reported that I averaged in the small
  millions for delivery rate.  I was wrong.  My stat tool (a little
  script foo under UCD-SNMP) was, umm, buggy, and was reporting
  wrongly high values.  My actual load averages between 110K - 500K
  deliveries per day, with occasional (rare) bursts up to the
  0.8M - 1.1M range.

On very busy days (I get bursts like this about once a month) my
main box will ship between 0.8M - 1.1M messages a day.  That
summates to 160K - 220K spool entries per day, which maps out to an
average transaction rate of 110 - 152 spool entries per minute.  At
that point the system is completely saturated and disk IO is the
gating factor (killing syslog helps a LOT, as does doing some level
of domain routing to smarthosts).
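
The rate figures above follow from simple division, sketched here.
The envelope size of 5 is an assumption inferred from the ratio of
messages to spool entries quoted above, and entries_per_minute is a
hypothetical helper, not a tool from this setup.

```python
MINUTES_PER_DAY = 24 * 60

def entries_per_minute(messages_per_day, envelope_size):
    """Average spool-entry rate, assuming deliveries are spread
    evenly over the day (a simplification)."""
    entries_per_day = messages_per_day / envelope_size
    return entries_per_day / MINUTES_PER_DAY

# Envelope size 5 is inferred: 0.8M messages / 160K spool entries.
for messages in (800_000, 1_100_000):
    print(messages, round(entries_per_minute(messages, 5), 1))
```

Real traffic is bursty, so peak rates will be higher than this daily
average.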

Were that condition a regular occurrence AND Mailman did domain
sorting and grouping, then increasing envelope size would make sense
as it would decrease the disk IO rate, allowing me to start getting
to the point where outbound bandwidth AND diskIO were equally gating
factors.  My sense, and this is a guesstimate, is that an envelope
size of around 30 - 50 would be the sweet spot for that condition (I
have a very heavily spread MX base).

> As far as I know Mailman doesn't try to batch recipients by domain
> though...

IIRC (I can't check right now) Mailman does do domain sorting, but
does not yet do domain grouping. 
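
To make the distinction concrete, here is a minimal sketch of the two
approaches.  It is not Mailman's code: every function name and address
below is hypothetical, and it assumes "sorting" means ordering
recipients by domain before cutting fixed-size envelopes, while
"grouping" means never mixing domains within one envelope.

```python
from itertools import groupby

def domain(addr):
    """Domain part of an address, lowercased."""
    return addr.rsplit("@", 1)[-1].lower()

def chunk(items, size):
    """Split a list into consecutive pieces of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def sorted_envelopes(recipients, size):
    """Domain sorting only: an envelope may still span several domains."""
    return chunk(sorted(recipients, key=domain), size)

def grouped_envelopes(recipients, size):
    """Domain grouping: each envelope holds a single domain."""
    envelopes = []
    for _, group in groupby(sorted(recipients, key=domain), key=domain):
        envelopes.extend(chunk(list(group), size))
    return envelopes

rcpts = ["a@aol.com", "b@hotmail.com", "c@aol.com",
         "d@example.org", "e@hotmail.com"]
print(sorted_envelopes(rcpts, 2))
print(grouped_envelopes(rcpts, 2))
```

With sorting alone the second envelope mixes example.org and
hotmail.com; with grouping each envelope maps to one target domain, at
the cost of some envelopes being smaller than the maximum.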

-- 
J C Lawrence                                       claw at kanga.nu
---------(*)                          http://www.kanga.nu/~claw/
--=| A man is as sane as he is dangerous to his environment |=--



