[Mailman-Users] Qrunner Locking/NFS issue.

Matt Ruzicka matt at frii.com
Sat May 21 00:43:38 CEST 2005


The other day we had the admin of one of our lists approve a held message
and have it "disappear".

Looking through the logs we were able to find this in the
mailman/logs/vette log:

May 18 09:14:54 2005 (502) LISTNAME post from PERSON at DOMAIN.TLD held,
message-id=<05d701c55bbc$3187f4a0$0700005a at COMPUTER>: Post to moderated list
May 18 09:17:29 2005 (7895) held message approved, message-id:
<05d701c55bbc$3187f4a0$0700005a at COMPUTER>

Looking in the mailman/logs/error log we found this:

May 18 09:17:44 2005 (502) Uncaught runner exception: [Errno 70] Stale NFS
file handle
May 18 09:17:44 2005 (502) Traceback (most recent call last):
  File /PATH/TO/mailman/Mailman/Queue/Runner.py, line 111, in _oneloop
    self._onefile(msg, msgdata)
  File /PATH/TO/mailman/Mailman/Queue/Runner.py, line 167, in _onefile
    keepqueued = self._dispose(mlist, msg, msgdata)
  File /PATH/TO/mailman/Mailman/Queue/IncomingRunner.py, line 115, in
_dispose
    mlist.Lock(timeout=mm_cfg.LIST_LOCK_TIMEOUT)
  File /PATH/TO/mailman/Mailman/MailList.py, line 159, in Lock
    self.__lock.lock(timeout)
  File /PATH/TO/mailman/Mailman/LockFile.py, line 288, in lock
    elif self.__read() == self.__tmpfname:
  File /PATH/TO/mailman/Mailman/LockFile.py, line 431, in __read
    filename = fp.read()
IOError: [Errno 70] Stale NFS file handle

May 18 09:17:44 2005 (502) SHUNTING:
1116429293.5192831+15acc1f8325f7d3428d8f80d85bc7979fd103ce2

I've searched for "Uncaught runner exception" and NFS (as well as a number
of other variations) and wasn't able to find what I was looking for.

Now obviously we have something bad happening with NFS, but is it just
coincidental that 15 seconds elapsed between the message being approved
and the error occurring, given that our DEFAULT_LOCK_LIFETIME value is
still set to the default of 15 seconds?
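
For reference, the 15-second figure comes from the two timestamps in the
log excerpts above (approval at 09:17:29, exception at 09:17:44). A minimal
sketch of that arithmetic follows. The helper name seconds_between is
made up for illustration and is not part of Mailman, and the format string
assumes the English month abbreviations shown in our logs.

```python
from datetime import datetime

# Timestamp layout as it appears in the log lines, e.g. "May 18 09:17:29 2005".
# Assumes English month abbreviations (C/English locale).
LOG_TIME_FORMAT = "%b %d %H:%M:%S %Y"


def seconds_between(start, end):
    """Return the number of seconds from `start` to `end` (hypothetical helper)."""
    started = datetime.strptime(start, LOG_TIME_FORMAT)
    ended = datetime.strptime(end, LOG_TIME_FORMAT)
    return (ended - started).total_seconds()


approved = "May 18 09:17:29 2005"  # "held message approved" entry in the vette log
errored = "May 18 09:17:44 2005"   # "Uncaught runner exception" entry in the error log

gap = seconds_between(approved, errored)
print(gap)
```

The result matches the 15-second lock lifetime exactly, which is why it
looks like more than a coincidence to me.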

Could we mitigate NFS errors like this if we were to tweak the
DEFAULT_LOCK_LIFETIME value?

We are currently running the mail handling portion of Mailman out of an
NFS mount on our FreeBSD mail server, and the web interface is served off
load balanced web servers that share the same NFS mounted directory.

Also, does anyone know where that message might have actually gone?

Thanks.

Matthew Ruzicka - Systems Administrator
Front Range Internet, Inc.
matt at frii.net - (970) 212-0728

Got SPAM?  Take back your email with MailArmory.  http://www.MailArmory.com


