[Bug 1839438] [NEW] ESTALE in cluster configuration

Hiroyuki Homma 1839438 at bugs.launchpad.net
Thu Aug 8 04:00:41 EDT 2019


Public bug reported:

I am running a Mailman cluster with two servers sharing archives/data/lists/locks/spam directories.
qfiles/logs directories are placed on each server's local volumes.

Our environment is:
CentOS 7.6
Mailman 2.1.29
GlusterFS 5.2 for shared volume.

When I sent 1000 messages to the same list in 500 seconds (2 messages
per second), about 20 messages were shunted because of a 'Stale file
handle' error.

Aug 06 15:14:45 2019 (15817) Uncaught runner exception: [Errno 116] Stale file handle
Aug 06 15:14:45 2019 (15817) Traceback (most recent call last):
  File "/usr/lib/mailman/Mailman/Queue/Runner.py", line 119, in _oneloop
    self._onefile(msg, msgdata)
  File "/usr/lib/mailman/Mailman/Queue/Runner.py", line 165, in _onefile
    mlist = self._open_list(listname)
  File "/usr/lib/mailman/Mailman/Queue/Runner.py", line 208, in _open_list
    mlist = MailList.MailList(listname, lock=False)
  File "/usr/lib/mailman/Mailman/MailList.py", line 133, in __init__
    self.Load()
  File "/usr/lib/mailman/Mailman/MailList.py", line 692, in Load
    dict, e = self.__load(file)
  File "/usr/lib/mailman/Mailman/MailList.py", line 663, in __load
    dict = loadfunc(fp)
IOError: [Errno 116] Stale file handle

Aug 06 15:14:45 2019 (15817) SHUNTING:
1565072084.945903+914cbad4e11aaa0523b7492edba5f4836db939d1

This happens when the recipient list's config.pck file is replaced by another server while it is being read.
ESTALE can occur in normal operation on shared volumes, and in most cases simply retrying the open/read is sufficient to recover from the error.
So I think retry logic should be implemented in the MailList.__load() method (the frame shown in the traceback above).
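
A minimal sketch of the kind of retry suggested above. This is not Mailman's code: the helper name load_with_estale_retry and its retries, delay and opener parameters are hypothetical, and the retry count and delay are arbitrary values chosen for illustration. It reopens the file on every attempt, because a handle that has gone stale cannot be reused, and re-raises any error other than ESTALE unchanged.

```python
import errno
import time


def load_with_estale_retry(path, loadfunc, retries=3, delay=0.1, opener=open):
    """Open `path` and return loadfunc(fp), reopening the file on ESTALE.

    Hypothetical helper: `retries` is the number of extra attempts made
    after the first failure, and `delay` is the pause in seconds between
    attempts.
    """
    attempt = 0
    while True:
        try:
            # Reopen on each attempt so a replaced file is picked up.
            with opener(path, "rb") as fp:
                return loadfunc(fp)
        except (IOError, OSError) as e:
            # Only ESTALE is retried; give up once the retry budget is spent.
            if e.errno != errno.ESTALE or attempt >= retries:
                raise
            attempt += 1
            time.sleep(delay)
```

Assuming the loader is a pickle reader, as config.pck suggests, a call would look like load_with_estale_retry(path, pickle.load). Whether the retry belongs in the private load method or in its caller is a design choice left open here.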

** Affects: mailman
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Mailman
Coders, which is subscribed to GNU Mailman.
https://bugs.launchpad.net/bugs/1839438

Title:
  ESTALE in cluster configuration

To manage notifications about this bug go to:
https://bugs.launchpad.net/mailman/+bug/1839438/+subscriptions
