[issue9205] Parent process hanging in multiprocessing if children terminate unexpectedly

Greg Brockman report at bugs.python.org
Wed Jul 14 17:01:20 CEST 2010


Greg Brockman <gdb at ksplice.com> added the comment:

Before I forget, it looks like we also need to deal with the result from a worker being un-unpickleable, i.e. the worker can pickle it but the parent cannot unpickle it:
"""
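#!/usr/bin/env python
import multiprocessing

def foo(x):
  # bar becomes a module-level global, but only inside the worker process.
  global bar
  def bar(x):
    pass
  # Pickled by reference in the worker; the parent has no bar to resolve it to.
  return bar

p = multiprocessing.Pool(1)
p.apply(foo, [1])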
"""

This shouldn't require much more work, but I'll hold off on submitting a patch until we have a better idea of where we're going in this arena.
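
For illustration, the failure can be reproduced without a pool at all. This is only a sketch of the mechanism, assuming the usual pickle behaviour of storing functions by module and name; the module name worker_main is a made-up stand-in for the worker's __main__, and deleting bar simulates a parent that never defined it:

```python
import pickle
import sys
import types

# Hypothetical stand-in for the worker's __main__ module.
worker_main = types.ModuleType("worker_main")
sys.modules["worker_main"] = worker_main

# "Worker" side: bar exists here, so pickling succeeds. The payload holds
# only the reference worker_main.bar, not the function body.
exec("def bar(x):\n    pass\n", worker_main.__dict__)
payload = pickle.dumps(worker_main.bar)

# "Parent" side: bar was never defined here, which we simulate by removing it.
del worker_main.bar

try:
    pickle.loads(payload)
    error = None
except AttributeError as exc:
    error = exc

print("unpickling failed:", error)
```

In the pool, the equivalent of pickle.loads() runs on the parent's side when it reads the result, so whatever reads results would need to catch this kind of error rather than die on it.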

> Instead of restarting crashed worker processes it will simply bring down
> the pool, right?
Yep.  Again, as things stand, once you've lost a worker, you've lost a task, and you can't really do much about it.  I guess that depends on your application though... is your use-case such that you can lose a task without it mattering?  If tasks are idempotent, one could have the task handler resubmit them, etc.  But really, thinking about the failure modes I've seen (OOM kills/user-initiated interrupt), I'm not sure under what circumstances I'd like the pool to try to recover.
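
As a rough sketch of what resubmitting idempotent tasks could look like (this is not how multiprocessing.Pool behaves; WorkerLost, run_with_resubmit and the flaky runner are all hypothetical names, and the "worker" is simulated in-process):

```python
class WorkerLost(Exception):
    """Hypothetical signal that the worker running a task died."""


def run_with_resubmit(run_task, tasks, max_attempts=3):
    """Run each task, resubmitting it if its worker is lost.

    Only safe when tasks are idempotent, since a lost task may have
    partially run before its worker died.
    """
    results = {}
    for task in tasks:
        for attempt in range(1, max_attempts + 1):
            try:
                results[task] = run_task(task)
                break
            except WorkerLost:
                if attempt == max_attempts:
                    raise  # give up and let the caller decide
    return results


def make_flaky_runner(fail_once):
    """Simulated worker: loses each task in fail_once exactly one time."""
    already_failed = set()

    def run_task(task):
        if task in fail_once and task not in already_failed:
            already_failed.add(task)
            raise WorkerLost(task)
        return task * task

    return run_task


results = run_with_resubmit(make_flaky_runner({2}), [1, 2, 3])
print(results)
```

The max_attempts cap matters for the failure modes mentioned above: a task that gets its worker OOM-killed every time should not be resubmitted forever.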

The idea of recording the mapping of tasks -> workers seems interesting.  Getting all of the corner cases right could be hard (e.g. making removing a task from the queue and recording which worker did the removing atomic, detecting if the worker crashed while still holding the queue lock), and doing this would require extra machinery.  This feature does seem to be useful for pools running many different jobs, because that way a crashed worker need only terminate one job.
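
To make the bookkeeping concrete, here is a minimal single-process sketch. TaskLedger and its methods are hypothetical names, not part of multiprocessing. It makes "remove from the queue" and "record who removed it" one step by doing both under the same lock; it does not address the hard part, which is doing this across processes and coping with a worker that dies while holding the lock:

```python
import threading
from collections import deque


class TaskLedger:
    """Hypothetical record of which worker currently holds which task."""

    def __init__(self, tasks):
        self._lock = threading.Lock()
        self._pending = deque(tasks)
        self._in_flight = {}  # task -> id of the worker that claimed it

    def claim(self, worker_id):
        """Pop the next task and record its owner under a single lock."""
        with self._lock:
            if not self._pending:
                return None
            task = self._pending.popleft()
            self._in_flight[task] = worker_id
            return task

    def complete(self, task):
        """Forget a task once its result has been delivered."""
        with self._lock:
            self._in_flight.pop(task, None)

    def tasks_held_by(self, worker_id):
        """Tasks that would be lost if this worker crashed right now."""
        with self._lock:
            return [t for t, w in self._in_flight.items() if w == worker_id]


ledger = TaskLedger(["a", "b", "c"])
first = ledger.claim(101)   # worker 101 takes "a"
second = ledger.claim(102)  # worker 102 takes "b"
ledger.complete(first)      # worker 101 finishes "a"

# If worker 102 died now, only the job owning "b" would need to fail.
print(ledger.tasks_held_by(102))
```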

Anyway, I'd be curious to know more about the kinds of crashes you've encountered from which you'd like to be able to recover.  Is it just Unpickleable exceptions, or are there others?

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue9205>
_______________________________________

