Multiple processes, one code
Scott Ransom
ransom at physics.mcgill.ca
Thu Feb 6 22:27:44 EST 2003
Hi All,
I am using Python to control a pulsar search code that performs
a series of very CPU-intensive operations on a list of files
(radio data). The code runs on dual-processor nodes of a
Beowulf cluster and the jobs are submitted through a batch
system. Each processor of a node executes an identical Python
script (via the batch system) and the processes need to
coordinate to evenly split up the files to be processed on the
node.
My current solution for this coordination seems like an
incredible kludge and I can't help thinking there is a better
way. I currently do it by creating and locking a temporary file,
something like this:
-------
#!/usr/bin/python
from os import remove
from glob import glob
from fcntl import lockf, LOCK_EX, LOCK_NB, LOCK_UN

files_to_process = glob("*.dat")

# If using 2 CPUs, determine if we will analyze the first
# half of the data files or the second half
flag = open("PROC_firsthalf", "w+")
try:
    lockf(flag.fileno(), LOCK_EX | LOCK_NB)
except IOError:  # Can't get the lock, therefore 2nd
    flag.close()
    flag = open("PROC_lasthalf", "w+")
    firsthalf = 0
    myfiles = files_to_process[len(files_to_process)//2:]
else:  # Got the lock, therefore 1st
    firsthalf = 1
    myfiles = files_to_process[:len(files_to_process)//2]

# Work on each file
for file in myfiles:
    ....

# Remove the lock and the temporary files
if firsthalf:
    lockf(flag.fileno(), LOCK_UN)
    flag.close()
    remove("PROC_firsthalf")
else:
    flag.close()
    remove("PROC_lasthalf")
-------------
One of the problems that I have with this solution is that if
a job fails or is killed before it reaches the cleanup code, the
PROC_firsthalf and PROC_lasthalf files are left around. I
therefore have to be very careful about cleaning up after
problem jobs have run.
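To make the cleanup issue concrete, here is a minimal sketch of wrapping the work in try/finally so that the flag file is removed when the work raises an exception. The helper name run_with_flag and its arguments are hypothetical and not part of the script above. This only covers failures that Python gets a chance to handle: a job killed with SIGKILL, or a node that goes down, would still leave the file behind.

```python
import os
from fcntl import lockf, LOCK_EX, LOCK_NB, LOCK_UN


def run_with_flag(flag_name, work):
    """Run work() while holding an exclusive lock on flag_name.

    Hypothetical helper for illustration only. Raises IOError/OSError
    if another process already holds the lock.
    """
    flag = open(flag_name, "w+")
    try:
        lockf(flag.fileno(), LOCK_EX | LOCK_NB)
    except IOError:
        # Someone else holds the lock: leave their flag file alone.
        flag.close()
        raise
    try:
        return work()
    finally:
        # Reached on normal return and on exceptions raised by work().
        lockf(flag.fileno(), LOCK_UN)
        flag.close()
        os.remove(flag_name)
```

A caller would pass the per-half processing loop as work, for example run_with_flag("PROC_firsthalf", process_first_half), where process_first_half is again a placeholder name.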
I would much prefer to do this without a temporary file being
written. Any ideas?
Thanks in advance,
Scott
--
Scott M. Ransom                       Address:  McGill Univ. Physics Dept.
Phone:  (514) 398-6492                          3600 University St., Rm 338
email:  ransom at physics.mcgill.ca             Montreal, QC  Canada  H3A 2T8
GPG print: 06A9 9553 78BE 16DB 407B FFCA 9BFA B6FF FFD3 2989