[Mailman-Developers] race condition in locking ?

Thomas Wouters thomas@xs4all.net
Thu, 3 Feb 2000 10:35:55 +0100


--J2SCkAp4GZ/dPZZf
Content-Type: text/plain; charset=us-ascii

On Wed, Feb 02, 2000 at 05:20:37PM -0500, Barry A. Warsaw wrote:

> I've only skimmed your emails but I think your analysis is basically
> correct, and the solution you propose here seems reasonable.  This
> mailing list is the right place to discuss it, and although I've been
> too busy to respond lately, I /will/ see your message, and your
> changes.

Okay, I figured you're a busy man, thanx for giving me a fast answer.

> My suggestions would be these: be sure to look at the latest version
> of LockFile.py from the CVS repository.  It doesn't fix your problems
> but it embodies the current semantics I want from the API.

I was already using the CVS version, so no worries.

> Second: if you submit substantial changes (as I suspect you will),
> we'll need to get FSF assignments from you.  I'll send you the
> appropriate form when I see how big your changes are, but I wanted to
> give you a heads up.

Well, attached you'll find the first change, the cvs diff to LockFile.py. My
apologies for this rather long email, but it's as much an explanation to
this list about the changes I made, as a way for me to get this info out of
my subconcious and into straight english. No need to read this all if you
read the source and understand it, or if you dont care about locking ;)

I kept the API the same with one teensy exception. With the new locking
scheme, it's not possible to 'steal' a lock, only to break the lock and then
enter the lock() cycle to try and grab it. So steal() might block
indefinately (though this should only occur on very heavily loaded machines,
with a lot of mailmen trying to grab the same lock.)

I've also added some semi-random (more like 'not very regular') factor to
the default sleep timeout, to slightly try to avoid running processes 'in
stride'. In the previous locker, processes that were walking in stride would
be guaranteed not to get the lock, and could only get out of it by
irregularities in the python execution speed and/or OS speed. In the new
locker collisions are much less harmful, but it's still nice to try and
avoid running in stride.

This version still has some debug info inserted, to test the timing of the
locks and the like, and it could be speeded up a fair bit once we're
satisfied that it works properly. Basically, how it works is like this:

The locker creates a private lockfile (__tmpfname) containing pid, tmpfname
and expected lifetime, and tries to make __lockfile a hard link to this
file. This'll fail when __lockfile already exists, or when __tmpfname has
been deleted (this shouldn't happen, though.)

If the link() fails, the locker tries to read the contents of the lockfile,
to find out who is locking it. If the read fails (the file does not exist,
or the contents was of unexpected format) or the lock is too old, it tries
to break the lock, using __break(). __break() checks __lockfile's CTIME, the
inode change time, to see if it has changed recently -- which would indicate
that the lock-owner is still alive, and that the lock contents might have
changed since we last checked it.

(CTIME is updated by writing to it, opening it for writing, and link()ing,
but not by reading from it. We only ever write/link to our own lockfile, not
the global one, so if there is write activity, it wasn't us. We do have to
flush() or close() after write()ing, however, to make sure the changes get
written out.)

Note that __write() writes to the private lockfile, whereas __read() reads
from the global one, and that there is a lot of redundancy in the
lock-checking; ordinarily, the nlink-check should be enough to determine
wether we have the lock or not. Also note that the locker no longer cleans
up other mailmens lockfiles. If a mailman crashes and leaves its own
lockfile, the file will not be deleted -- perhaps a default cronjob to clear
old locks is in order, I'll think about that. In any case, old lockfiles
shouldn't be a problem even if another mailman comes along on the same
machine, with the same pid, trying to lock the same file, as it will simply
use the file already there, and delete it when it's done.

The other api functions (refresh, set_lifetime, unlock, locked) all should
work exactly the same, only better -- in particular refresh() no longer
exposes a race condition, thanks to the ctime check in __break(). And
because the new locking scheme guarantees that the first attempt to get the
lock after an unlock(), will succeed, the default sleep interval can be
safely raised. I've set the default to 1 second, which is just above the
average lock age on my listserver, given decent load.

I've tested this locker quite heavily, mailbombing my listserver with 100+
simultaneous messages to mailman-test (which generated 100+ bounces sent
back to mailman, because my bouncing email address is still on the list) and
it hasn't barfed yet ;)

-- 
Thomas Wouters <thomas@xs4all.net>

Hi! I'm a .signature virus! copy me into your .signature file to help me spread!

--J2SCkAp4GZ/dPZZf
Content-Type: text/plain; charset=us-ascii
Content-Disposition: attachment; filename="mailman.cvs.diff"

Index: Mailman/LockFile.py
===================================================================
RCS file: /projects/cvsroot/mailman/Mailman/LockFile.py,v
retrieving revision 1.13
diff -u -r1.13 LockFile.py
--- LockFile.py	1999/12/14 04:22:05	1.13
+++ LockFile.py	2000/02/03 09:30:33
@@ -27,6 +27,8 @@
 import errno
 #from stat import ST_NLINK
 ST_NLINK = 3                                      # faster
+ST_CTIME = 9
+ST_INO   = 1
 
 
 # default intervals are both specified in seconds, and can be floating point
@@ -36,7 +38,8 @@
 # sleep before checking the lock's status, if we were not the winning claimant 
 # the previous time around.
 DEFAULT_LOCK_LIFETIME   = 15
-DEFAULT_SLEEP_INTERVAL = .25
+DEFAULT_SLEEP_INTERVAL  = 1.0
+DEFAULT_INODE_GRACETIME = 2
 
 
 from Mailman.Logging.StampedLogger import StampedLogger
@@ -79,20 +82,21 @@
 
         lifetime is the maximum length of time expected to keep this lock.
         This value is written into the lock file so that other claimants on
-        the lock know when it is safe to steal the lock, should the lock
+        the lock know when it is safe to break the lock, should the lock
         holder be wedged.
 
-        sleep_interval is how often to wake up and check the lock file
+        sleep_interval is roughly how often to wake up and check the lock file
 
         """
         self.__lockfile = lockfile
         self.__lifetime = lifetime
-        self.__sleep_interval = sleep_interval
+        self.__sleep_interval = sleep_interval + (((os.getpid() % 20) - 11.5)/100.)
         self.__tmpfname = "%s.%s.%d" % (lockfile,
                                         socket.gethostname(),
                                         os.getpid())
         self.__withlogging = withlogging
-        self.__kickstart()
+        self.__write()
+        self.__writelog("starting")
 
     def set_lifetime(self, lifetime):
         """Reset the lifetime of the lock.
@@ -129,28 +133,6 @@
         if self.locked():
             self.unlock()
 
-    def __kickstart(self, force=0):
-        # forcing means to remove the original lock file, and create a new
-        # one.  this might be necessary if the file contains bogus locker
-        # information such that the owner of the lock can't be determined
-        if force:
-            try:
-                os.unlink(self.__lockfile)
-            except IOError:
-                pass
-        if not os.path.exists(self.__lockfile):
-            try:
-                # make sure it's group writable
-                oldmask = os.umask(002)
-                try:
-                    file = open(self.__lockfile, 'w+')
-                    file.close()
-                finally:
-                    os.umask(oldmask)
-            except IOError:
-                pass
-        self.__writelog('kickstarted')
-
     def __write(self):
         # we expect to release our lock some time in the future.  we want to
         # give other claimants some clue as to when we think we're going to be
@@ -169,22 +151,38 @@
             os.umask(oldmask)
 
     def __read(self):
-        # can raise ValueError in two situations:
+        # can raise ValueError in three situations:
         #
-        # either first element wasn't an integer (a valid pid), or we didn't
-        # get a sequence of the right size from the string.split.  Either way,
-        # the data in the file is bogus, but this is caught and handled higher
-        # up.
-        fp = open(self.__tmpfname, 'r')
+        # - The lockfile does not exist (it's been released or broken)
+        # - We didn't get a sequence of the right size from split
+        # - the first 'word' is not an integer
+        # In each case we raise ValueError in response
+        try:
+            fp = open(self.__lockfile, 'r')
+        except IOError, err:
+            if err.errno == errno.ENOENT:
+                raise ValueError # invalid contents
         try:
             pid, winner, lockrelease = string.split(string.strip(fp.read()))
         finally:
             fp.close()
         return int(pid), winner, float(lockrelease)
 
-    # Note that no one new can grab the lock once we've opened our tmpfile
-    # until we close it, even if we don't have the lock.  So checking the PID
-    # and stealing the lock are guaranteed to be atomic.
+    def __break(self):
+       try:
+           if os.stat(self.__lockfile)[ST_CTIME] > time.time() - DEFAULT_INODE_GRACETIME:
+               # The lockfile has changed very recently, so it might have
+               # changed since our caller evaluated the validity
+               return
+       except OSError, err:
+           if err.errno == errno.ENOENT:
+               return # Someone broke it already
+           raise
+       os.unlink(self.__lockfile)
+
+    # Note that this function is mostly redundant, the try/except around
+    # os.link does almost all of what we do. Instead of lockfile-contents
+    # we could use lockfile CTIME, which would probably speed things up.
     def lock(self, timeout=0):
         """Blocks until the lock can be obtained.
 
@@ -197,21 +195,42 @@
         synchronized.
 
         """
-        if timeout > 0:
-            timeout_time = time.time() + timeout
-        last_pid = -1
         if self.locked():
             self.__writelog('already locked')
             raise AlreadyLockedError
-        stolen = 0
+        if timeout:
+            timeout_time = time.time() + timeout
+        last_pid = -1
+        self.__write() # Make sure our own lockfile exists
         while 1:
             # create the hard link and test for exactly 2 links to the file
-            os.link(self.__lockfile, self.__tmpfname)
+            try:
+                os.link(self.__tmpfname, self.__lockfile)
+            except OSError, err:
+                if err.errno == errno.ENOENT:
+                    # Our own lockfile has vanished from beneath us ?
+                    self.__write()
+                    # sleep anyway, so we dont hog the machine
+                    self.__sleep()
+                    continue
+                elif err.errno == errno.EEXIST:
+                    # We failed to get the lock. If we worry about this locking
+                    # scheme, we set a variable here and ascertain ourselves of 
+                    # its value later, when we think we have the lock.
+                    pass
+                else:
+                    raise
+
             if os.stat(self.__tmpfname)[ST_NLINK] == 2:
-                # we have the lock (since there are no other links to the lock
-                # file), so we can piss on the hydrant
+                # we have the lock (since noone overwrote our link to the lock
+                # file), so we re-piss our hydrant, to refresh the lock timeout
                 self.__write()
-                self.__writelog('got the lock')
+                now = time.time()
+                if not self.locked():
+                    self.__writelog('false positive on lockfile (%f)'%now)
+                    self.__sleep()
+                    continue
+                self.__writelog('got the lock (%f)'%now)
                 break
             # we didn't get the lock this time.  let's see if we timed out
             if timeout and timeout_time < time.time():
@@ -220,83 +239,34 @@
                 raise TimeOutError
             # someone else must have gotten the lock.  let's find out who it
             # is.  if there is some bogosity in the lock file's data then we
-            # will steal the lock.
+            # will try to steal the lock.
             try:
                 pid, winner, lockrelease = self.__read()
             except ValueError:
-                os.unlink(self.__tmpfname)
-                self.__kickstart(force=1)
+                self.__break()
+                self.__sleep()
                 continue
-            # If we've gotten to here, we should not be the winner, because
-            # otherwise, an AlreadyCalledLockError should have been raised
-            # above, and we should have never gotten into this loop.  However, 
-            # the following scenario can occur, and this is what the stolen
-            # flag takes care of:
-            #
-            # Say that processes A and B are already laying claim to the lock
-            # by creating link files, and say A actually has the lock (i.e., A
-            # is the winner).  We are process C and we lay claim by creating a
-            # link file.  All is cool, and we'll trip the pid <> last_pid test
-            # below, unlink our claim, sleep and try again.  Second time
-            # through our loop, we again determine that A is the winner but
-            # because it and B are swapped out, we trip our lifetime test
-            # and figure we need to steal the lock.  So we piss on the hydrant
-            # (write our info into the lock file), unlink A's link file and go
-            # around the loop again.  However, because B is still laying
-            # claim, and we never knew it (since it wasn't the winner), we
-            # again have 3 links to the lock file the next time through this
-            # loop, and the assert will trip.
-            #
-            # The stolen flag alerts us that this has happened, but I still
-            # worry that our logic might be flawed here.
-            assert stolen or winner <> self.__tmpfname
-            # record the identity of the previous winner.  lockrelease is the
-            # expected time that the winner will release the lock by.  we
-            # don't want to steal it until this interval has passed, otherwise 
-            # we could steal the lock out from underneath that process.
+
+            # check for hanging processes, by remembering pid, and if it stays
+            # the same, checking lockrelease.
             if pid <> last_pid:
                 last_pid = pid
-            # here's where we potentially steal the lock.  if the pid in the
-            # lockfile hasn't changed by lockrelease (a fixed point in time),
-            # then we assume that the locker crashed
+
             elif lockrelease < time.time():
-                self.__write()                # steal
-                self.__writelog('stolen!')
-                stolen = 1
-                try:
-                    os.unlink(winner)
-                except os.error:
-                    # winner lockfile could be missing
-                    pass
-                try:
-                    os.unlink(self.__tmpfname)
-                except os.error, (code, msg):
-                    # Let's say we stole the lock, but some other process's
-                    # claim was never cleaned up, perhaps because it crashed
-                    # before that could happen.  The test for acquisition of
-                    # the lock above will fail because there will be more than
-                    # one hard link to the main lockfile.  But we'll get here
-                    # and winner==self.__tmpfname, so the unlink above will
-                    # fail (we'll have deleted it twice).  We could just steal
-                    # the lock, but there's no reliable way to clean up the
-                    # stale hard link, so we raise an exception instead and
-                    # let the human operator take care of the problem.
-                    if code == errno.ENOENT:
-                        self.__log('stale lockfile found')
-                        raise StaleLockFileError(
-                            'Stale lock file found linked to file: '
-                            +self.__lockfile+' (requires '+
-                            'manual intervention)')
-                    else:
-                        raise
+                # We've waited long enough...
+                # Attempt to break the lock. We dont care much about the old
+                # winners __tmpfname, let it stay.
+                self.__break()
+                self.__sleep()
                 continue
-            # okay, someone else has the lock, we didn't steal it, and our
-            # claim hasn't timed out yet.  So let's wait a while for the owner
-            # of the lock to give it up.  Unlink our claim to the lock and
-            # sleep for a while, then try again
-            os.unlink(self.__tmpfname)
+            # okay, someone else has the lock, we weren't able to steal it,
+            # and our claim hasn't timed out yet.  So let's wait a while for
+            # the owner of the lock to give it up.
             self.__writelog('waiting for claim')
-            time.sleep(self.__sleep_interval)
+            self.__sleep()
+    
+    def __sleep(self):
+        time.sleep(self.__sleep_interval)
 
     # This could error if the lock is stolen.  You must catch it.
     def unlock(self):
@@ -308,24 +278,39 @@
         """
         if not self.locked():
             raise NotLockedError
+        now = time.time()
+        os.unlink(self.__lockfile)
         os.unlink(self.__tmpfname)
-        self.__writelog('unlocked')
+        self.__writelog('unlocked (%f)'%now)
 
     def locked(self):
         """Returns 1 if we own the lock, 0 if we do not."""
         if not os.path.exists(self.__tmpfname):
             return 0
+        if os.stat(self.__tmpfname)[ST_NLINK] <> 2:
+            return 0
+        # The above check for nlink should be enough to validate our claim on
+        # the lock. Nevertheless, we are paranoid.
         try:
             pid, winner, lockrelease = self.__read()
         except ValueError:
-            # the contents of the lock file was corrupted
-            os.unlink(self.__tmpfname)
-            self.__kickstart(force=1)
-            return 0
+            # the contents of the lock file was corrupted, or the global lock
+            # is not a link to our lock.
+            if os.stat(self.__lockfile)[ST_INO] <> os.stat(self.__tmpfname)[ST_INO]:
+                self.__writelog("nlink of 2 but we don't own the lock ? strangeness")
+                os.unlink(self.__tmpfname)
+                self.__write()
+                return 0
+            else:
+                # Okay, we really do have the lock, but the contents is messed up
+                self.__write()
+                return 1
         return pid == os.getpid()
 
     # use with caution!!!
     def steal(self):
         """Explicitly steal the lock.  USE WITH CAUTION!"""
-        self.__write()
-        self.__writelog('explicitly stolen')
+        self.__writelog('explicitly breaking')
+        self.__break()
+        self.lock()
+

--J2SCkAp4GZ/dPZZf--