multiprocessing signal defect

Adam Tauno Williams awilliam at whitemice.org
Fri Oct 29 10:08:01 EDT 2010


On Fri, 2010-10-29 at 08:39 -0400, Neal Becker wrote: 
> Adam Tauno Williams wrote:
> 
> > On Fri, 2010-10-29 at 08:12 -0400, Neal Becker wrote:
> >> Seems multiprocessing doesn't behave well with signals:
> >> ---------
> >> from multiprocessing import Pool
> >> import time
> >> def sleep (dummy):
> >>     time.sleep (10)
> >> if __name__ == '__main__':
> >>     pool = Pool (processes=2)
> >>     result = pool.map (sleep, range (4))
> >> -------------
> >> start it up
> >> $ python test_multip.py
> >> ----------------------
> >> ps auxf | grep python
> >> nbecker   6605  1.6  0.1 338192  6952 pts/1    Sl+  08:03   0:00  |      
> >> \_ python test_multip.py
> >> nbecker   6606  0.0  0.1 186368  4760 pts/1    S+   08:03   0:00  |
> >> \_ python test_multip.py
> >> nbecker   6607  0.0  0.1 186372  4740 pts/1    S+   08:03   0:00  |
> >> \_ python test_multip.py
> >> kill 6607
> >>  ps auxf | grep python
> >> nbecker   6605  0.5  0.1 338192  6952 pts/1    Sl+  08:03   0:00  |      
> >> \_ python test_multip.py
> >> nbecker   6606  0.0  0.1 186368  4760 pts/1    S+   08:03   0:00  |
> >> \_ python test_multip.py
> >> nbecker   6607  0.0  0.0      0     0 pts/1    Z+   08:03   0:00  |
> >> \_ [python] <defunct>
> >>  kill 6606
> >> ps auxf | grep python
> >> nbecker   6605  0.3  0.1 338192  6952 pts/1    Sl+  08:03   0:00  |      
> >> \_ python test_multip.py
> >> nbecker   6606  0.0  0.0      0     0 pts/1    Z+   08:03   0:00  |
> >> \_ [python] <defunct>
> >> nbecker   6607  0.0  0.0      0     0 pts/1    Z+   08:03   0:00  |
> >> \_ [python] <defunct>
> >> Now we have 2 dead children and the parent is hung forever.
> >> Isn't this a serious defect?
> > No, I think this is just POSIX/UNIX process behavior.  If the parent
> > never waits on (joins) the child, the child's exit status is never
> > collected and its process-table entry lingers [which is what a
> > Zombie process is].
> > For example, see the do_verify_workers method in
> > <http://coils.hg.sourceforge.net/hgweb/coils/coils/file/6ab5ade3e488/src/coils/logic/workflow/services/executor.py>
> > A parent process needs to make some effort to reap its children.
> Yes, and isn't this a defect in the multiprocessing module that the parent 
> process does not reap its children in response to signals as shown above?

No, I don't think so.  You're asking the module to overgeneralize
behavior.  Reaping the child is important, and the fact that a child
needs to be reaped may matter to the parent process (why did it exit?
did something go wrong?).  Silently reaping children [which would
reduce the size of the Pool?  Or would it dynamically create a new
worker?] might have unintended side effects.  Maybe, since Pool
specifically generalizes child management, you could argue that it
should reap, but I'm not sure.  Personally, I'd recommend that your
worker processes include a signal handler that does something smart
in the case of a "-15" (SIGTERM, the default signal sent by kill) [for
which there isn't really a thread equivalent - can you send a System V
style signal to an individual thread in a process?  I don't think so.]
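
A minimal sketch of that recommendation, assuming a POSIX system and a
plain multiprocessing.Process rather than a Pool.  The names
handle_sigterm and worker are hypothetical, and "something smart" here
is only a stand-in: report the signal and leave through the normal
SystemExit path.  The parent then joins the child, which is what
collects its exit status.

```python
import os
import signal
import sys
import time
from multiprocessing import Event, Process


def handle_sigterm(signum, frame):
    # Hypothetical "something smart": report the signal, then raise
    # SystemExit so that any finally blocks in the worker still run.
    sys.stderr.write("worker %d: got signal %d, shutting down\n"
                     % (os.getpid(), signum))
    sys.exit(0)


def worker(ready):
    # Install the handler inside the child process itself.
    signal.signal(signal.SIGTERM, handle_sigterm)
    ready.set()       # tell the parent the handler is in place
    time.sleep(10)    # stand-in for real work


if __name__ == "__main__":
    ready = Event()
    child = Process(target=worker, args=(ready,))
    child.start()
    ready.wait()       # do not signal before the handler is installed
    child.terminate()  # sends SIGTERM on POSIX, like "kill <pid>"
    child.join()       # waits for the child and collects its exit status
    print("exit code: %r" % child.exitcode)
```

With the handler installed the reported exit code should be 0; without
it, multiprocessing reports -15 for a child killed by SIGTERM.  With a
Pool, the same handler could be installed through the initializer
argument, but the parent would still have to decide what to do about
the task the dead worker was running.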

How would a 'traditional' thread pool behave if a thread abended?  [of
course, that depends on the thread-pool implementation]  The correct
behavior in case of an exception in a thread is a topic of some debate.
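
As one data point only, here is a sketch of how one particular thread
pool treats a failing task.  It assumes the standard library's
concurrent.futures.ThreadPoolExecutor, which is not the implementation
discussed in this thread; the task and run_all names are hypothetical.
In this pool the exception is stored on the task's Future and the
remaining tasks still run.

```python
from concurrent.futures import ThreadPoolExecutor


def task(n):
    # Hypothetical task: one input fails, the others succeed.
    if n == 2:
        raise ValueError("task %d failed" % n)
    return n * n


def run_all(values):
    # Collect either the result or the exception object for each task.
    outcomes = []
    with ThreadPoolExecutor(max_workers=2) as executor:
        futures = [executor.submit(task, v) for v in values]
        for future in futures:
            error = future.exception()  # blocks until the task is done
            if error is not None:
                outcomes.append(error)
            else:
                outcomes.append(future.result())
    return outcomes


if __name__ == "__main__":
    for outcome in run_all(range(4)):
        print(repr(outcome))
```

Other thread-pool implementations may instead drop the failure, log
it, or let the worker thread die, which is the debate referred to
above.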



