os.wait() losing child?

Hrvoje Niksic hniksic at xemacs.org
Thu Jul 12 08:32:18 EDT 2007


Jason Zheng <Xin.Zheng at jpl.nasa.gov> writes:

> greg wrote:
>> Jason Zheng wrote:
>>> Hate to reply to my own thread, but this is the working program
>>> that can demonstrate what I posted earlier:
>> I've figured out what's going on. The Popen class has a
>> __del__ method which does a non-blocking wait of its own.
>> So you need to keep the Popen instance for each subprocess
>> alive until your wait call has cleaned it up.
>> The following version seems to work okay.
>>
> It still doesn't work on my machine. I took a closer look at the Popen
> class, and I think the problem is that the __init__ method always
> calls a method _cleanup, which polls every existing Popen
> instance.

Actually, it's not that bad.  _cleanup only polls instances that user
code no longer references but whose child process is still running.  If
you hang on to your Popen instances, they are never added to _active, so
__init__ won't reap them (_active is only populated from Popen.__del__).
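
To make the mechanism concrete, here is a toy model of it.  It is only a
sketch: ToyPopen, its "running" flag and the "reaped_by" field are made
up for illustration, no real process is started, and it assumes
CPython's reference counting so that __del__ runs as soon as the last
reference goes away.

```python
class ToyPopen:
    """Toy stand-in for Popen (hypothetical); no real process is started."""

    _active = []  # orphaned instances whose "child" is still running

    def __init__(self, name):
        ToyPopen._cleanup()      # every constructor call polls the orphans
        self.name = name
        self.running = True      # pretend the child is running
        self.reaped_by = None    # who collected the exit status, if anyone

    def poll(self, who):
        # Stand-in for a non-blocking wait: reap the child if it has exited.
        if not self.running and self.reaped_by is None:
            self.reaped_by = who
        return self.reaped_by

    def __del__(self):
        # No user reference is left but the child still runs: park the
        # instance in _active so that a later _cleanup() can reap it.
        if self.running:
            ToyPopen._active.append(self)

    @classmethod
    def _cleanup(cls):
        # Poll only the parked orphans, never instances the user still holds.
        for inst in cls._active[:]:
            if inst.poll("_cleanup") is not None:
                cls._active.remove(inst)


kept = ToyPopen("kept")      # user code holds a reference
ToyPopen("dropped")          # no reference kept, so __del__ parks it
orphans = [p.name for p in ToyPopen._active]

dropped = ToyPopen._active[0]
dropped.running = False      # both "children" exit
kept.running = False

third = ToyPopen("third")    # __init__ -> _cleanup() reaps only the orphan
winner = kept.poll("os.wait")  # the kept child is still there for us to reap
```

In this model the dropped instance ends up reaped by _cleanup, while the
kept one is left alone until user code collects it.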

This version is a trivial modification of your code to that effect.
Does it work for you?

#!/usr/bin/python

import os
from subprocess import Popen

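# pid -> (Popen instance, child index); storing the Popen instance here
# keeps it referenced, so Popen.__del__ never adds it to _active
pids = {}
counts = [0,0,0]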

for i in xrange(3):
   p = Popen('sleep 1', shell=True, cwd='/home', stdout=file(os.devnull,'w'))
   pids[p.pid] = p, i
   print "Starting child process %d (%d)" % (i,p.pid)

while True:
   # blocks until any child of this process exits
   pid, ignored = os.wait()
   try:
      p, i = pids[pid]
   except KeyError:
      # not one of ours
      continue
   del pids[pid]
   counts[i] += 1

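   # stop restarting this child once it has run 10 times
   if counts[i] == 10:
      print "Child Process %d terminated." % i
      # exit only when every child has completed its 10 runs; the
      # reduce() call without an initial value never compared counts[0]
      # against 10
      if min(counts) >= 10:
         break
      continue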

   print "Child Process %d terminated, restarting" % i
   p = Popen('sleep 1', shell=True, cwd='/home', stdout=file(os.devnull,'w'))
   pids[p.pid] = p, i
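
For readers on Python 3, the same idea might look roughly like the
sketch below.  It is an adaptation, not the script above: the
run_children function, its parameters and the do-nothing child command
are assumptions chosen so the example finishes quickly, and it needs a
Unix-like system because it relies on os.wait().

```python
import os
import subprocess
import sys


def run_children(n_children=3, runs_each=2):
    """Restart each child until it has run runs_each times; return the counts."""
    # Placeholder child that exits immediately (assumption for the example).
    cmd = [sys.executable, "-c", "pass"]
    live = {}                      # pid -> (Popen, slot); keeps each Popen referenced
    counts = [0] * n_children

    def spawn(slot):
        p = subprocess.Popen(cmd, stdout=subprocess.DEVNULL)
        live[p.pid] = (p, slot)

    for slot in range(n_children):
        spawn(slot)

    while live:
        pid, status = os.wait()    # blocks until any child exits
        entry = live.pop(pid, None)
        if entry is None:
            continue               # not one of ours
        p, slot = entry
        # We reaped the child ourselves, so record the result on the Popen
        # object; otherwise it would still look like a running process.
        if os.WIFEXITED(status):
            p.returncode = os.WEXITSTATUS(status)
        else:
            p.returncode = -os.WTERMSIG(status)
        counts[slot] += 1
        if counts[slot] < runs_each:
            spawn(slot)            # restart until this slot has run enough times
    return counts


if __name__ == "__main__":
    print(run_children())
```

The dictionary plays the same role as pids above: as long as a Popen
object sits in it, the object stays referenced until our own os.wait()
call has collected its child.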


