Multiple processes, one code

Jp Calderone exarkun at intarweb.us
Fri Feb 7 13:21:06 EST 2003


On Thu, Feb 06, 2003 at 10:27:44PM -0500, Scott Ransom wrote:
> Hi All,
> 
> I am using Python to control a pulsar search code that performs 
> a series of very CPU intensive operations on a list of files 
> (radio data).  The code runs on dual processor nodes of a 
> Beowulf cluster and the jobs are submitted through a batch 
> system.  Each processor of a node executes an identical Python 
> script (via the batch system) and the processes need to 
> coordinate to evenly split up the files to be processed on the 
> node.
> 
> My current solution for this coordination seems like an 
> incredible kludge and I can't help thinking there is a better 
> way.  I currently do it by creating and locking a temporary file 
> something like this:
> 
> [snip]
> 
> One of the problems that I have with this solution is that if 
> something happens to the job(s), the PROC_firsthalf and 
> PROC_lasthalf files are left around.  I therefore have to be 
> very careful about cleaning up after problem jobs have run.


  "Something happens"?  Do you mean, the hardware crashes?  Python crashes? 
Your program raises an unhandled exception?  Someone sends SIGHUP to the
process?  Someone sends SIGKILL to the process?

  If I were writing this, I'd probably want to handle at least some of those
cases slightly differently.  However, if all you're concerned about is
ensuring that PROC_firsthalf and PROC_lasthalf disappear, perhaps you can take
advantage of the fact that, on Unix, a file removed from the filesystem
remains available to any process that still has it open.  That is, open() the
file, then remove() it, then start processing the data.  This has the
side effect of removing the need for your locking code.  (Actually, it
introduces a race condition, but I believe it also removes a race condition
from your original code, so on the whole, it balances out <wink>.)
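
  A minimal sketch of the open()-then-remove() idea, in case it helps.  It
only demonstrates the Unix behavior described above; it is not a replacement
for the snipped coordination code, which I haven't tried to reproduce.  The
helper name open_then_unlink and the temporary directory are made up for
illustration, and this assumes a Unix filesystem (removing an open file
generally fails on Windows).

```python
import os
import tempfile


def open_then_unlink(path):
    """Create and open `path`, then remove its name from the filesystem.

    Assumption: Unix semantics. The returned file object stays usable until
    it is closed or the process exits, at which point the kernel frees the
    storage. No name is left behind, however the process ends.
    """
    f = open(path, "w+")
    os.remove(path)
    return f


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    marker = os.path.join(workdir, "PROC_firsthalf")  # name from the post

    f = open_then_unlink(marker)
    print("name still present:", os.path.exists(marker))

    # The open handle is still fully readable and writable.
    f.write("still usable\n")
    f.seek(0)
    print("contents:", f.read().strip())

    f.close()
    os.rmdir(workdir)
```

  Note that once the name is gone, another process can no longer find the
file by looking for it, which is where the new race condition comes from.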

  There are more complex solutions, but they depend on exactly what behavior
you're looking for.  Maybe the questions at the beginning of my post will
help you organize your thinking about how to handle each case.
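
  As one illustration of why the cases differ, here is a rough sketch that
removes a marker file when the work function raises an exception or when the
process receives SIGHUP or SIGTERM.  The names run_with_cleanup and
_raise_system_exit are hypothetical, and this is only one possible approach
under those assumptions.  It cannot help with SIGKILL or a hardware crash,
since no Python code runs in either case.

```python
import os
import signal


def _raise_system_exit(signum, frame):
    # Turn a catchable signal into an ordinary exception so that
    # `finally` blocks get a chance to run.
    raise SystemExit(128 + signum)


def run_with_cleanup(marker_path, work):
    """Call work() and remove marker_path afterwards.

    Covers: normal return, unhandled exceptions, SIGHUP and SIGTERM.
    Does not cover: SIGKILL, interpreter crashes, hardware failure.
    Must be called from the main thread (a signal.signal() requirement).
    """
    previous = {}
    for sig in (signal.SIGHUP, signal.SIGTERM):
        previous[sig] = signal.signal(sig, _raise_system_exit)
    try:
        return work()
    finally:
        try:
            os.remove(marker_path)
        except FileNotFoundError:
            pass  # already gone, nothing to clean up
        # Restore whatever handlers were installed before.
        for sig, handler in previous.items():
            if handler is not None:
                signal.signal(sig, handler)
```

  Usage would look something like run_with_cleanup("PROC_firsthalf",
process_files), where process_files stands in for whatever function does the
real work.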

  Jp

-- 
There are 10 kinds of people: those who understand binary and those who do
not.
-- 
 up 53 days, 21:50, 7 users, load average: 0.16, 0.25, 0.22