[issue40379] multiprocessing's default start method of fork()-without-exec() is broken

Michał Górny report at bugs.python.org
Fri Feb 11 11:13:53 EST 2022


Michał Górny <mgorny at gentoo.org> added the comment:

After updating PyPy3 to use Python 3.9's stdlib, we hit very bad hangs because of this: even compiling a single file with "parallel" compileall could hang.  In the end, we had to revert the change in how Python 3.9 starts workers because otherwise multiprocessing would be impossible to use:

https://foss.heptapod.net/pypy/pypy/-/commit/c594b6c48a48386e8ac1f3f52d4b82f9c3e34784

This is a very bad default, and what's even worse, it often causes deadlocks that are hard to reproduce or debug.  Furthermore, since "fork" is the default, people are unintentionally relying on its support for passing non-picklable objects and are creating non-portable code.  The code often becomes complex and hard to change before they discover the problem.
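To illustrate the portability point, here is a rough sketch of how one might check whether an object could be passed to a worker under "spawn".  The helper name survives_spawn is hypothetical, and plain pickle is only an approximation: multiprocessing uses its own pickler with extra reducers, so treat a False result as a warning sign rather than a verdict.

```python
import pickle
import threading


def survives_spawn(obj):
    """Rough check: under "spawn", Process arguments are pickled and sent
    to the child, whereas "fork" simply inherits them from the parent's
    memory.  This hypothetical helper only tries plain pickle."""
    try:
        pickle.dumps(obj)
    except Exception:
        # Lambdas raise PicklingError, locks raise TypeError, and so on.
        return False
    return True


candidates = {
    "tuple of ints": (1, 2, 3),
    "lambda": lambda x: x + 1,
    "threading.Lock": threading.Lock(),
}

for name, obj in candidates.items():
    print(name, "->", "ok" if survives_spawn(obj) else "not picklable")
```

Code that passes the second or third kind of object to a worker may appear to work under "fork" and then fail as soon as the start method changes.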

Before we managed to figure out how to work around the deadlocks in PyPy3, we experimented with switching the default to "spawn".  Unfortunately, we hit multiple projects that didn't work with this method, precisely because of pickling problems.  Furthermore, their authors were surprised to learn that their code wouldn't work on macOS (after all, many people perceive Python as a language for writing portable software).
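For projects that want to test against "spawn" without waiting for the default to change, a minimal sketch could look like the following.  The function names square and square_all are made up for illustration; multiprocessing.get_context() is the documented way to get a context bound to one start method without changing the process-wide default.

```python
import multiprocessing


def square(n):
    # Defined at module level so it can be pickled by reference,
    # which "spawn" needs in order to send the task to a worker.
    return n * n


def square_all(values, method="spawn"):
    # get_context() binds this pool to the requested start method
    # and leaves the global default untouched.
    ctx = multiprocessing.get_context(method)
    with ctx.Pool(processes=2) as pool:
        return pool.map(square, values)
```

Under "spawn" the child re-imports the main module, so a script should call square_all() only from inside an `if __name__ == "__main__":` guard.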

Finally, back in 2018 I made one of my projects do parallel work using multiprocessing.  It gave its users a great speedup, but for some it caused deadlocks that I could neither reproduce nor debug.  In the end, I had to revert it.  Now that I've learned about this problem, I'm wondering if this wasn't precisely because of the "fork" method.

----------
nosy: +mgorny

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue40379>
_______________________________________

