[Mailman-Users] Follow up to a Lock problem.

Mark Sapiro mark at msapiro.net
Thu Oct 28 23:50:14 CEST 2010


Llewellyn Curran wrote:
>
>I have included some errors from the locks log below. Can you
>assist me?
>
>Errors:
>[root at x mailman]# tail -50 /var/log/mailman/locks
>Oct 26 21:13:56 2010 (17763)   File
>"/usr/lib/mailman/Mailman/Queue/IncomingRunner.py", line 115, in _dispose
>Oct 26 21:13:56 2010 (17763)
>mlist.Lock(timeout=mm_cfg.LIST_LOCK_TIMEOUT)
>Oct 26 21:13:56 2010 (17763)   File "/usr/lib/mailman/Mailman/MailList.py",
>line 161, in Lock
>Oct 26 21:13:56 2010 (17763)     self.__lock.lock(timeout)
>Oct 26 21:13:56 2010 (17763)   File "/usr/lib/mailman/Mailman/LockFile.py",
>line 287, in lock
>Oct 26 21:13:56 2010 (17763)     self.__linkcount(), important=True)
>Oct 26 21:13:56 2010 (17763)   File "/usr/lib/mailman/Mailman/LockFile.py",
>line 416, in __writelog
>Oct 26 21:13:56 2010 (17763)     traceback.print_stack(file=logf)
>Oct 26 21:13:58 2010 (17763) xlt.lock unexpected linkcount: 1
>Oct 26 21:13:58 2010 (17763)   File "/usr/lib/mailman/bin/qrunner", line
>278, in ?
>Oct 26 21:13:58 2010 (17763)     main()
>Oct 26 21:13:58 2010 (17763)   File "/usr/lib/mailman/bin/qrunner", line
>238, in main
>Oct 26 21:13:58 2010 (17763)     qrunner.run()
>Oct 26 21:13:58 2010 (17763)   File
>"/usr/lib/mailman/Mailman/Queue/Runner.py", line 71, in run
>Oct 26 21:13:58 2010 (17763)     filecnt = self._oneloop()
>Oct 26 21:13:58 2010 (17763)   File
>"/usr/lib/mailman/Mailman/Queue/Runner.py", line 112, in _oneloop
>Oct 26 21:13:58 2010 (17763)     self._onefile(msg, msgdata)
>Oct 26 21:13:58 2010 (17763)   File
>"/usr/lib/mailman/Mailman/Queue/Runner.py", line 170, in _onefile
>Oct 26 21:13:58 2010 (17763)     keepqueued = self._dispose(mlist, msg,
>msgdata)
>Oct 26 21:13:58 2010 (17763)   File
>"/usr/lib/mailman/Mailman/Queue/VirginRunner.py", line 38, in _dispose
>Oct 26 21:13:58 2010 (17763)     return IncomingRunner._dispose(self, mlist,
>msg, msgdata)
>Oct 26 21:13:58 2010 (17763)   File
>"/usr/lib/mailman/Mailman/Queue/IncomingRunner.py", line 115, in _dispose
>Oct 26 21:13:58 2010 (17763)
>mlist.Lock(timeout=mm_cfg.LIST_LOCK_TIMEOUT)
>Oct 26 21:13:58 2010 (17763)   File "/usr/lib/mailman/Mailman/MailList.py",
>line 161, in Lock
>Oct 26 21:13:58 2010 (17763)     self.__lock.lock(timeout)
>Oct 26 21:13:58 2010 (17763)   File "/usr/lib/mailman/Mailman/LockFile.py",
>line 287, in lock
>Oct 26 21:13:58 2010 (17763)     self.__linkcount(), important=True)
>Oct 26 21:13:58 2010 (17763)   File "/usr/lib/mailman/Mailman/LockFile.py",
>line 416, in __writelog
>Oct 26 21:13:58 2010 (17763)     traceback.print_stack(file=logf)
>Oct 26 21:13:58 2010 (17763) xlt.lock unexpected linkcount: 1
>Oct 26 21:13:58 2010 (17763)   File "/usr/lib/mailman/bin/qrunner", line
>278, in ?
>Oct 26 21:13:58 2010 (17763)     main()
>Oct 26 21:13:58 2010 (17763)   File "/usr/lib/mailman/bin/qrunner", line
>238, in main
>Oct 26 21:13:58 2010 (17763)     qrunner.run()
>Oct 26 21:13:58 2010 (17763)   File
>"/usr/lib/mailman/Mailman/Queue/Runner.py", line 71, in run
>Oct 26 21:13:58 2010 (17763)     filecnt = self._oneloop()
>Oct 26 21:13:58 2010 (17763)   File
>"/usr/lib/mailman/Mailman/Queue/Runner.py", line 112, in _oneloop
>Oct 26 21:13:58 2010 (17763)     self._onefile(msg, msgdata)
>Oct 26 21:13:58 2010 (17763)   File
>"/usr/lib/mailman/Mailman/Queue/Runner.py", line 170, in _onefile
>Oct 26 21:13:58 2010 (17763)     keepqueued = self._dispose(mlist, msg,
>msgdata)
>Oct 26 21:13:58 2010 (17763)   File
>"/usr/lib/mailman/Mailman/Queue/VirginRunner.py", line 38, in _dispose
>Oct 26 21:13:58 2010 (17763)     return IncomingRunner._dispose(self, mlist,
>msg, msgdata)
>Oct 26 21:13:58 2010 (17763)   File
>"/usr/lib/mailman/Mailman/Queue/IncomingRunner.py", line 115, in _dispose
>Oct 26 21:13:58 2010 (17763)
>mlist.Lock(timeout=mm_cfg.LIST_LOCK_TIMEOUT)
>Oct 26 21:13:58 2010 (17763)   File "/usr/lib/mailman/Mailman/MailList.py",
>line 161, in Lock
>Oct 26 21:13:58 2010 (17763)     self.__lock.lock(timeout)
>Oct 26 21:13:58 2010 (17763)   File "/usr/lib/mailman/Mailman/LockFile.py",
>line 287, in lock
>Oct 26 21:13:58 2010 (17763)     self.__linkcount(), important=True)
>Oct 26 21:13:58 2010 (17763)   File "/usr/lib/mailman/Mailman/LockFile.py",
>line 416, in __writelog
>Oct 26 21:13:58 2010 (17763)     traceback.print_stack(file=logf)
>[root at x mailman]#


To obtain a lock, the process first writes a file named, e.g.,
<listname>.lock.<hostname>.<pid>.<counter> and then attempts to create
a hard link to that file named <listname>.lock.
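
To illustrate the two steps just described, here is a minimal sketch in
Python. It is not the code in Mailman/LockFile.py; the function names
try_lock and release are made up for this example, and it makes a single
attempt rather than retrying until a timeout.

```python
import os
import socket


def try_lock(lockfile, counter=0):
    """Make one attempt to take the lock.

    Returns the name of the per-process file on success, or None if
    the lock file already exists. (Hypothetical helper, not Mailman's API.)
    """
    # Step 1: write a file whose name is unique to this host and process.
    tmpname = "%s.%s.%d.%d" % (lockfile, socket.gethostname(),
                               os.getpid(), counter)
    with open(tmpname, "w") as fp:
        fp.write(tmpname)
    # Step 2: hard-link it to the shared lock name. Creating a link fails
    # with EEXIST if the target name is already taken.
    try:
        os.link(tmpname, lockfile)
    except FileExistsError:
        os.unlink(tmpname)
        return None
    return tmpname


def release(lockfile, tmpname):
    """Drop both names for the lock (hypothetical helper)."""
    os.unlink(lockfile)
    os.unlink(tmpname)
```

After a successful try_lock, the lock file and the per-process file are
two names for the same inode, so the lock file's link count is 2.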

In the case above, the OS returned an EEXIST error to the attempted
link, meaning that the <listname>.lock file existed (xlt.lock in this
case). A lock that is properly held has a link count of 2: the lock
file itself plus the <listname>.lock.<hostname>.<pid>.<counter> file it
is linked to. Here, 'unexpected linkcount: 1' says that xlt.lock is not
linked to any such file or to any other file.
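
If you want to check the link count of a lock file yourself, something
along these lines should do it. This is only a sketch; lock_state is a
hypothetical name, and the strings it returns are my own labels (the last
one just mimics the wording of the log message).

```python
import os


def lock_state(lockfile):
    """Classify a lock file by its hard-link count (hypothetical helper)."""
    try:
        nlink = os.stat(lockfile).st_nlink
    except FileNotFoundError:
        # No lock file at all, so nobody holds the lock.
        return "free"
    if nlink == 2:
        # The lock file plus the per-process file it was linked from.
        return "held"
    # Any other count means the per-process file is gone, or something
    # else created or linked the lock file.
    return "unexpected linkcount: %d" % nlink
```

For example, lock_state("/path/to/locks/xlt.lock") would be called with
whatever your lock directory actually is; the path here is a placeholder.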

When this happens, the contents of that xlt.lock file should give the
hostname and pid of the process that obtained the lock. That may help
you identify what created it.
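
As a rough sketch of pulling the hostname and pid out of the file, the
following assumes the contents are a single name of the form
<listname>.lock.<hostname>.<pid>.<counter>, possibly with a directory in
front. That layout is an assumption on my part, so look at the file with
cat first. parse_lock_claim is a hypothetical helper.

```python
def parse_lock_claim(contents):
    """Return (hostname, pid) from lock file contents, or None.

    Assumes contents like '<listname>.lock.<hostname>.<pid>.<counter>'.
    """
    text = contents.strip()
    # Everything after the first '.lock.' should be host, pid and counter.
    _prefix, sep, rest = text.partition(".lock.")
    if not sep:
        return None
    # Split from the right because the hostname may itself contain dots.
    parts = rest.rsplit(".", 2)
    if len(parts) != 3 or not parts[1].isdigit():
        return None
    hostname, pid, _counter = parts
    return hostname, int(pid)
```

You would pass it the text read from the lock file, e.g.
parse_lock_claim(open(path).read()), and then look for that pid on that
host with ps.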

Also see the FAQ at <http://wiki.list.org/x/_4A9> for how to ensure
that only one mailmanctl and one set of qrunners are running, and make
sure that's the case.

-- 
Mark Sapiro <mark at msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan


