[issue39360] python3.8 regression - ThreadPool join via __del__ hangs forever

STINNER Victor report at bugs.python.org
Fri Mar 20 13:42:46 EDT 2020


STINNER Victor <vstinner at python.org> added the comment:

> Victor, are you OK if we backport both changes to 3.8?

Let me look at commit 9ad58acbe8b90b4d0f2d2e139e38bb5aa32b7fb6:
"bpo-19466: Py_Finalize() clears daemon threads earlier (GH-18848)"

Calling _PyThreadState_DeleteExcept() in Py_FinalizeEx() is really dangerous. It frees the PyThreadState memory of daemon threads. Daemon threads continue to run while Py_FinalizeEx() is running (which takes an unknown amount of time; we only know that it's larger than 0 seconds). When a daemon thread attempts to acquire the GIL, it will likely crash if its PyThreadState memory has been freed. This memory can be overwritten by another memory allocation, or dereferencing the pointer can trigger a segmentation fault.
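Seen from the application side, the hazard only exists while a daemon thread is still running during finalization. Here is a minimal sketch of avoiding that window, assuming the application owns its daemon threads and can signal them to stop. The names worker() and run_and_shutdown() are hypothetical and are not part of CPython or of the change discussed here:

```python
import threading


def worker(started, stop_event, ticks):
    # Hypothetical daemon-thread body: every loop iteration needs the GIL,
    # so it must not still be running once finalization has begun.
    started.set()
    while not stop_event.is_set():
        ticks.append(1)
        stop_event.wait(0.01)


def run_and_shutdown():
    started = threading.Event()
    stop_event = threading.Event()
    ticks = []
    thread = threading.Thread(
        target=worker, args=(started, stop_event, ticks), daemon=True
    )
    thread.start()
    started.wait(timeout=5)   # make sure the thread is actually running
    stop_event.set()          # ask the thread to exit its loop
    thread.join(timeout=5)    # wait for it before the interpreter exits
    return thread.is_alive()


if __name__ == "__main__":
    print("daemon thread still alive:", run_and_shutdown())
```

This does not make finalization itself any safer. It only ensures that this particular thread never tries to take the GIL after interpreter shutdown has started.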

This change caused multiple regressions in the master branch. I had a hard time fixing all the crashes: I modified take_gil(), the function which acquires the GIL, 4 times, and I'm still not sure that my fix is correct. This function is really fragile and I would prefer not to touch it in a stable branch. See the bpo-39877 changes to get an idea of the complexity of the problem.

Python finalization is really fragile: https://pythondev.readthedocs.io/finalization.html

----------

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue39360>
_______________________________________

