[Mailman-Users] Mailman performance / sends per hour
Brad Knowles
brad.knowles at skynet.be
Sat Jul 26 12:03:57 CEST 2003
At 7:43 PM -0400 2003/07/25, Jon Carnes wrote:
> Actually Brad, it looks like your knowledge of Sendmail is rather dated.
> Sendmail has been doing this since 2001.
>
> http://www.sendmail.org/~ca/email/doc8.12/RELEASE_NOTES
This is old. Check the RELEASE_NOTES for version 8.12.9 (which
has a major security fix, and you are advised not to use any older
version of 8.12), or 8.12.10.Beta2 (which I quote here, dated Jul
1 05:08). The only references I can find to the word "sort" anywhere
in this file with regard to version 8.12 or later are:
8.12.7/8.12.7 2002/12/29
Do not lookup MX records when sorting the MSP queue. The MSP
only needs to relay all mail to the MTA. Problem found
by Gary Mills of the University of Manitoba.
Avoid problems with QueueSortOrder=random due to problems with
qsort() on Solaris (and maybe some other operating systems).
Problem noted by Stephan Schulz of Gruner+Jahr.
8.12.0/8.12.0 2001/09/08
If the new option FastSplit (defaults to one) has a value greater
than zero, it suppresses the MX lookups on addresses when they
are initially sorted which may result in faster envelope
splitting. If the mail is submitted directly from the
command line, then the value also limits the number of
processes to deliver the envelopes; if more envelopes are
created they are only queued up and must be taken care of
by a queue run.
QueueSortOrder=Random sorts the queue randomly, which is useful if
several queue runners are started by hand to avoid contention.
QueueSortOrder=Modification sorts the queue by the modification time
of the qf file (older entries first).
Note that none of these make any mention whatsoever of tracking
previous average delivery times for a recipient and using this as a
predictor for future average delivery times, and therefore sorting
the current input on this basis.
But please check again to make sure I didn't miss something. You
know me, I've only been mucking about with sendmail since ~1991, my
name only comes up in the full RELEASE_NOTES four times, I was only
the sendmail FAQ maintainer from ~1995 to ~1997, and I could easily
have forgotten or missed something.
> Postfix has some very interesting features that make it much better to
> use than Sendmail, but the one that sets it most apart in added
> efficiency is its default queueing structure.
You mean the hashed queues? Yes, that's good, but sendmail can
do better with the optional multiple queue structure. With this
option, sendmail gives you more control over how many queues are
created at what depth, instead of giving you an arbitrary number of
sixteen queue directories per hash level. Since most filesystems
start flaking out with more than about 1000 directory entries at a
single level, you can flatten the sendmail queue structure
significantly and still have fewer files per leaf directory node than
postfix would allow.
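As a rough illustration of the arithmetic, here is a minimal Python
sketch. The queue size of 100,000 messages, the hash depth of two, and
the figure of 500 flat queue directories are all assumptions picked for
the example, not measurements or defaults I am vouching for. It also
counts messages rather than files, so it ignores that a queued message
may occupy more than one file.

```python
import math


def hashed_leaf_count(fanout, depth):
    """Leaf directories in a hashed tree with `fanout` subdirectories per level."""
    return fanout ** depth


def messages_per_leaf(queued_messages, leaf_directories):
    """Worst-case even spread of queued messages over the leaf directories."""
    return math.ceil(queued_messages / leaf_directories)


# Hypothetical numbers, for illustration only.
queued = 100_000
hashed = messages_per_leaf(queued, hashed_leaf_count(16, 2))  # 256 leaves
flat = messages_per_leaf(queued, 500)  # 500 queue directories at one level

print(f"hashed, 16 per level, depth 2: {hashed} messages per leaf")
print(f"flat, 500 queue directories:   {flat} messages per leaf")
```

Under these assumptions the flat layout stays below the ~1000 entries
per directory rule of thumb at every level while holding fewer messages
per leaf than the two-level hash.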
Moreover, it is the hashed queue structure that postfix uses, and
the way it manages the queue by moving files on disk from one
directory structure to another, that cause the fundamental
performance limitations which sendmail allows you to exceed. Note
that sendmail never moves files around on-disk, and therefore does
not incur additional unnecessary synchronous meta-data updates.
Indeed, with the safe asynchronous writes feature, sendmail can
safely avoid causing any synchronous meta-data updates at all for
most cases, as the mail messages are small enough that they can be
buffered in memory and delivered on the initial delivery attempt.
Only large messages or messages that fail the initial delivery
attempt end up getting written to disk at all, which means that
sendmail can approach pure RAM/network I/O throughput speeds whereas
postfix will always be bound by disk I/O.
> I do agree with you though, that if the MTA (or Mailman) could
> periodically sweep the MTA delivery logs and sort the domains from
> fastest to slowest, there would be an increase in efficiency.
This is the feature *I* was talking about, although I'd be
inclined to do it on an individual basis and not a domain basis,
since some individuals might have .procmailrc or other processing
scripts on the remote end that might be significantly slower to
process than other recipients within the same domain.
For situations where this is not an issue at the remote end, the
problem would largely solve itself because all those recipients would
tend to sort together anyway.
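A minimal sketch of that idea in Python follows. It is not something
sendmail, postfix, or Mailman does today as far as this thread
establishes. The function name, the shape of the `history` records
(address, seconds taken), and the choice to treat unknown recipients as
having a zero average are all assumptions made for illustration; how the
history would be extracted from the MTA delivery logs is left out.

```python
from collections import defaultdict


def sort_by_average_delivery_time(recipients, history, default=0.0):
    """Order recipients from fastest to slowest past average delivery.

    `history` is an iterable of (address, seconds) pairs from earlier
    deliveries. Recipients with no history get `default` as their average.
    """
    totals = defaultdict(lambda: [0.0, 0])  # address -> [sum of seconds, count]
    for address, seconds in history:
        entry = totals[address.lower()]
        entry[0] += seconds
        entry[1] += 1

    def average(address):
        total, count = totals.get(address.lower(), (0.0, 0))
        return total / count if count else default

    # sorted() is stable, so recipients with equal averages keep their order.
    return sorted(recipients, key=average)
```

Because the key is the individual address, recipients in the same domain
with similar delivery times end up next to each other without any
per-domain bookkeeping.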
> For larger lists and Mailman, I have found that nothing beats using a
> RAM disk and accessing the list database files via the mounted RAM disk.
> The speed increase can be 100x faster.
If you're going to be a professional spammer, then I would
suggest using the professional spammer tools.
Otherwise, if you're going to run a mailing list for normal
people, then I would suggest that you pay attention to sections 5.3.3
and 5.3.4 of RFC 1123 "Internet Host Requirements", which is also
part of STD0003:
5.3.3 Reliable Mail Receipt
When the receiver-SMTP accepts a piece of mail (by sending a
"250 OK" message in response to DATA), it is accepting
responsibility for delivering or relaying the message. It must
take this responsibility seriously, i.e., it MUST NOT lose the
message for frivolous reasons, e.g., because the host later
crashes or because of a predictable resource shortage.
If there is a delivery failure after acceptance of a message,
the receiver-SMTP MUST formulate and mail a notification
message. This notification MUST be sent using a null ("<>")
reverse path in the envelope; see Section 3.6 of RFC-821. The
recipient of this notification SHOULD be the address from the
envelope return path (or the Return-Path: line). However, if
this address is null ("<>"), the receiver-SMTP MUST NOT send a
notification. If the address is an explicit source route, it
SHOULD be stripped down to its final hop.
DISCUSSION:
For example, suppose that an error notification must be
sent for a message that arrived with:
"MAIL FROM:<@a,@b:user@d>". The notification message
should be sent to: "RCPT TO:<user@d>".
Some delivery failures after the message is accepted by
SMTP will be unavoidable. For example, it may be
impossible for the receiver-SMTP to validate all the
delivery addresses in RCPT command(s) due to a "soft"
domain system error or because the target is a mailing
list (see earlier discussion of RCPT).
To avoid receiving duplicate messages as the result of
timeouts, a receiver-SMTP MUST seek to minimize the time
required to respond to the final "." that ends a message
transfer. See RFC-1047 [SMTP:4] for a discussion of this
problem.
In particular, this means that you can't use a RAM disk for this
application. You *could* use a battery-backed solid-state disk, so
long as you could guarantee that it is configured in such a way that
it will survive power loss, reboots, remounting, filesystem checks,
etc. Of course, a proper SSD is much, much more expensive than a
simple RAM disk.
The alternative is using sendmail with the above-mentioned safe
asynchronous writes feature, which allows you to get full use of your
RAM, at nearly RAM disk speeds, but to do so safely.
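For reference, the notification-address rule in the section 5.3.3
quoted above can be sketched in a few lines of Python. The function name
is hypothetical and the parsing is deliberately naive (it assumes a
well-formed reverse path and does no real address validation); it only
illustrates the two rules: no notification for a null reverse path, and
an explicit source route stripped down to its final hop.

```python
def notification_recipient(reverse_path):
    """Return the address a failure notification should go to, or None.

    `reverse_path` is the argument of MAIL FROM, e.g. "<@a,@b:user@d>".
    """
    path = reverse_path.strip()
    if path.startswith("<") and path.endswith(">"):
        path = path[1:-1]
    if not path:
        # Null reverse path "<>": a notification MUST NOT be sent.
        return None
    if path.startswith("@") and ":" in path:
        # Explicit source route: keep only the final hop after the colon.
        path = path.split(":", 1)[1]
    return path


print(notification_recipient("<@a,@b:user@d>"))  # prints: user@d
print(notification_recipient("<>"))              # prints: None
```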
--
Brad Knowles, <brad.knowles at skynet.be>
"They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety."
-Benjamin Franklin, Historical Review of Pennsylvania.
GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+
!w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++)
tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++)