[Mailman-Developers] Re-tries for failed SMTPDirect deliveries

Barry A. Warsaw bwarsaw@cnri.reston.va.us
Tue, 28 Mar 2000 12:12:00 -0500 (EST)


Last night, I added some code to queue messages that fail delivery
when using SMTPDirect.  What happens is this:

If delivery of a message fails completely (e.g. the smtp socket
connect fails), or fails for some but not all recipients, then the
message is stored on the file system for a re-try later.

For every failed message, two files are created.  The base name of
these files is the SHA hexdigest of the message text.  This
should be nearly guaranteed unique.  Both files live in a new
directory called `qfiles'.  The first file created is the complete plain
text of the failed message.  The second file is a marshal of useful
information related to the failed delivery.  This contains the
listname and the failed recip list along with a few other moderately
useful bits of info.
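
As a rough illustration of the two-file layout described above (a
sketch under assumptions, not the code being checked in): the function
name `enqueue_failed', the `.msg' and `.db' suffixes, the dictionary
keys, and the choice of SHA-1 via hashlib are all hypothetical.

```python
import hashlib
import marshal
import os


def enqueue_failed(qdir, msgtext, listname, failed_recips):
    """Store a failed message and its delivery info under qdir.

    Returns the base name shared by the two files.
    """
    os.makedirs(qdir, exist_ok=True)
    # Base name: hex digest of the message text (SHA-1 assumed here).
    base = hashlib.sha1(msgtext.encode("utf-8")).hexdigest()
    # First file: the complete plain text of the failed message.
    msgpath = os.path.join(qdir, base + ".msg")
    with open(msgpath, "w", encoding="utf-8", newline="") as fp:
        fp.write(msgtext)
    # Second file: a marshal of information about the failed delivery.
    # Only the list name and failed recipients are modelled here.
    data = {"listname": listname, "recips": list(failed_recips)}
    with open(os.path.join(qdir, base + ".db"), "wb") as fp:
        marshal.dump(data, fp)
    return base
```

Because the base name is derived from the message text alone, queueing
the same text twice in this sketch overwrites the earlier pair of files
rather than creating a second pair.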

There's a new cron script called `qrunner' which cruises the files in
qfiles.  It claims a lock (to prevent multiple qrunner processes) and
then goes through each file it finds, attempting redelivery.  If there
are any problems reading a queue file, it logs a message and skips the
file until next time (assuming the problem with the file is transient).
When qrunner notices that the message has been handed off to the smtp
daemon for all outstanding recipients, it deletes the two message
files.
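
The qrunner pass described above might look roughly like the following
sketch.  It is only an illustration under assumptions: `run_queue', the
`deliver' callback, the `qrunner.lock' file name, the `.msg'/`.db'
suffixes and the `recips' key are all hypothetical, and rewriting the
recipient list after a partial success is a guess at reasonable
behavior rather than something stated above.

```python
import logging
import marshal
import os

log = logging.getLogger("qrunner")


def run_queue(qdir, deliver):
    """Try to redeliver every queued message found in qdir.

    deliver(msgtext, data) is a hypothetical callback returning the
    recipients that still failed.  Returns False if the lock is held.
    """
    lockfile = os.path.join(qdir, "qrunner.lock")
    try:
        # O_EXCL makes creation fail if another qrunner holds the lock.
        fd = os.open(lockfile, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    try:
        for name in sorted(os.listdir(qdir)):
            if not name.endswith(".msg"):
                continue
            msgpath = os.path.join(qdir, name)
            dbpath = msgpath[:-len(".msg")] + ".db"
            try:
                with open(msgpath, encoding="utf-8", newline="") as fp:
                    msgtext = fp.read()
                with open(dbpath, "rb") as fp:
                    data = marshal.load(fp)
            except (OSError, EOFError, ValueError, TypeError) as exc:
                # Assume a transient problem: log it and retry next run.
                log.error("skipping %s: %s", name, exc)
                continue
            remaining = deliver(msgtext, data)
            if not remaining:
                # Handed off for all outstanding recipients: clean up.
                os.unlink(msgpath)
                os.unlink(dbpath)
            else:
                # Assumption: remember only the still-failing recipients.
                data["recips"] = list(remaining)
                with open(dbpath, "wb") as fp:
                    marshal.dump(data, fp)
    finally:
        os.close(fd)
        os.unlink(lockfile)
    return True
```

A lock file created this way is left behind if the process is killed,
so a real implementation would also need some way to detect and break
stale locks.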

I've moderately tested this stuff with total delivery failure by
shutting off my smtp daemon, attempting some deliveries, turning it
back on and running qrunner.  I don't have the time right now to test
partial delivery failures, but I still claim that without DSN support,
these will be unlikely.  Hopefully some of you can help look at this.

I'm about to check all this stuff in.  Let me know what you think.
-Barry