multiple instance on Unix

Andrew Dalke adalke at mindspring.com
Thu Sep 30 00:58:14 EDT 2004


Nigel King wrote:
> I had a program that was called randomly by specific emails arriving 
> which asked for certain information. If two or more emails arrived 
> simultaneously then procmail asked two or more instances of my program 
> to run. These instances interfered with one another, so I needed a 
> process to stop that from happening.
   ..
> Now, this works but I wondered whether anybody knew of a more standard 
> bit of python code was available for ensuring that only one instance was 
> processing.

What you've just described is a pretty standard approach.
You might also look into the fcntl module for lockf/flock.
(I get confused when I need to deal with them.)

The 'open with O_EXCL' solution by Mike Meyer also works.
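For the fcntl route, here's a minimal sketch of how flock can guard
a single instance.  The lock file path and function name are my own;
LOCK_NB makes the attempt non-blocking:

```python
import fcntl

def acquire_lock(path):
    """Try to take an exclusive, non-blocking flock on 'path'.

    Returns the open file object on success (keep it open to hold
    the lock -- closing it releases the lock), or None if another
    process already holds it.
    """
    f = open(path, "w")
    try:
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except (IOError, OSError):
        f.close()
        return None
    return f
```

One nice property: the lock disappears automatically when the process
exits, so a crash can't leave a stale lock behind.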

>     # This program is not thread safe so we must protect it from being
>     # trampled over by another copy

Thread-safe means something different than this.  Multiple
threads run in the same process space and can modify
the same in-memory data structures.

OTOH, you're using the filesystem as the "in-memory data
structure" so it is the moral equivalent.  Another
solution is to make your program "thread safe" so that
you don't have this problem at all.

If your program modifies shared data files then this
of course is not possible.

>     # pause if another email is being processed for half an hour maximum
>     t = time.time()+1800

That's pretty excessive.  Does it often take
that long for a program to finish?

Here's one method for multiple processes to get
along in an ordered manner.

Have a shared directory that everyone can write to.

When a program starts, make
   $dir/lockfile-$timestamp-$pid

This will be unique and named in time order,
except in very rare cases that you can ignore
(the system clock goes backwards to correct for
time skew, or the process id rolls over in the
same moment that multiple requests come in).

You can probably use str(time.time()) for the
timestamp and str(os.getpid()) for the pid.

Once the program creates the file, make a symbolic
link from that to the real lock file.  Call that
one simply $dir/lockfile

my_lockfile = $dir/lockfile-$timestamp-$pid
system_lockfile = $dir/lockfile

os.symlink(my_lockfile, system_lockfile)

The symlink call will pass or fail atomically,
so no two programs can succeed at the same
time.  This gives you the one-at-a-time
characteristic you're looking for.  (It
may fail on a remotely mounted filesystem, but
then again so would the mkdir trick or O_EXCL.)
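For comparison, the O_EXCL approach mentioned above looks like this
(a sketch; the function name is mine).  The open succeeds for exactly
one process, atomically:

```python
import os

def try_lock(path):
    """Return True if we created the lock file, False if it exists."""
    try:
        # O_CREAT|O_EXCL fails if the file already exists
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except OSError:
        return False
    os.close(fd)
    return True
```

Unlike the symlink scheme, though, this gives you no ordering and no
way to find out which process holds the lock.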

When finished, delete the two lock files.

If the symlink fails, sleep and try again
later.

To get the ordering you want, if the symlink
fails then scan the directory and look for
a process lockfile which has an earlier
timestamp.  If there is one, sleep for a
bit and do the check again.

If there is a waiting lock file before the
given process then one trick is to find the
one with the timestamp immediately before
the given program's.  Then when it wakes up
all it needs to do is check the existence
of that file.  If it doesn't exist then
attempt the symlink-based lock, which should
succeed (since there isn't any earlier
process trying to get the same lock).

Another trick is to check for programs
that have the lock but have crashed, so
will never delete their lock files.  A waiting
process can look at the lockfile, follow the
symlink back to the original file, and
get the locking process's pid.

If the process no longer exists (and it's
very unlikely you'll sleep long enough
to cycle through the pids to have a new
process with the same pid) then the next
program in line should feel free to delete
the link and take over the lock.
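That stale-lock check might be sketched like this (untested in the
same spirit as the code below; the function name is mine, and the
filename scheme is the lockfile-$timestamp-$pid one from above):

```python
import os

def break_stale_lock(system_lockfile):
    """Remove the system lock if its owning process is gone.

    Returns True if a stale lock was removed.
    """
    try:
        target = os.readlink(system_lockfile)
    except OSError:
        return False              # no lock, or not a symlink
    # the target is named "lockfile-$timestamp-$pid"
    pid = int(os.path.basename(target).split("-")[2])
    try:
        os.kill(pid, 0)           # signal 0 only tests existence
    except OSError:
        os.remove(system_lockfile)   # owner is gone; break the lock
        return True
    return False
```

There's still a small race if two waiters try to break the same stale
lock at once, so the loser must be prepared for the remove to fail.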

That's a lot to say.  The code's pretty
simple.  Should be something like (UNTESTED!)

import os
import time

basedir = "lock-directory"
sys_lockfilename = os.path.join(basedir, "lockfile")
my_lockfilename = os.path.join(basedir,
           "-".join(["lockfile", str(time.time()), str(os.getpid())]))

def get_lock_info(filename):
   # Parse a "lockfile-$timestamp-$pid" name; ignore other files
   terms = os.path.basename(filename).split("-")
   if len(terms) != 3 or terms[0] != "lockfile":
     return None
   return float(terms[1]), int(terms[2])

def get_next_earlier_info(basedir, my_name):
   filedata = []
   for filename in os.listdir(basedir):
     fileinfo = get_lock_info(filename)
     if fileinfo is not None:
       # ordered by time, pid, filename
       filedata.append( fileinfo + (filename,) )

   # Sort by earliest time
   filedata.sort()

   # Find where this file is, then get the entry before it
   my_data = get_lock_info(my_name) + (os.path.basename(my_name),)
   i = filedata.index(my_data)
   if i == 0:
     return None
   return filedata[i-1]

def pid_still_running(pid):
   try:
     os.kill(pid, 0)
   except OSError:
     return 0
   return 1

def getlock(my_lockfilename, sys_lockfilename):
   try:
     os.symlink(my_lockfilename, sys_lockfilename)
   except OSError:
     return False
   return True

def main():
   # Announce our place in line by creating our uniquely named file
   f = open(my_lockfilename, "w")
   f.close()

   try:
     if not getlock(my_lockfilename, sys_lockfilename):
       wait_info = get_next_earlier_info(basedir, my_lockfilename)

       if wait_info:
         wait_pid = wait_info[1]
         while pid_still_running(wait_pid):
           time.sleep(5)

     while not getlock(my_lockfilename, sys_lockfilename):
       time.sleep(5)
       # Could check if the process is still running
       # (use os.readlink() to get the filename pointed to
       # by the system lockfile) but it's tricky because of
       # the rare possibility that two processes are in this
       # loop.  I would need to put more thought into this
       # to figure out the exact ordering.

     try:
       ... do your work here ...
     finally:
       os.remove(sys_lockfilename)


   finally:
     os.remove(my_lockfilename)


				Andrew
				dalke at dalkescientific.com
