[issue30165] faulthandler acquires lock from signal handler, can deadlock while crashing

Gregory P. Smith report at bugs.python.org
Wed Apr 26 23:12:13 EDT 2017


Gregory P. Smith added the comment:

This report is based on manual code inspection in CPython head after we encountered a deadlock using pytracemalloc on Python 2.7.12 where it _appeared_ to be the scenario I've described.

I see now that I missed noticing the "#ifndef Py_HAVE_NATIVE_TLS" within thread.c which should imply a different PyThread_get_key_value() implementation that likely does not use our lock acquiring fallback find_key().  So my code analysis may not make sense...

To give a taste of the large process setup we saw it in:

* A CPython process with extension modules that create their own threads, possibly alongside threads created by Python itself.
* We've called faulthandler.enable() *and* faulthandler.register(SIGTERM).
* We send a SIGTERM to processes in this environment which are taking too long to complete; the goal was to get a stack trace of what the process was potentially stuck doing when our external timeout monitoring mechanism kicked in.
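For readers who want to see the shape of that configuration, here is a minimal sketch using the standard library faulthandler module on Python 3 (on 2.7 it is a separately installed backport, as far as I know). The helper names install_dump_handlers and dump_on_sigterm_demo and the temporary log file are hypothetical, chosen only for illustration; this is not the actual setup from the affected process, and it does not reproduce the deadlock. It is POSIX only, since faulthandler.register() is not available on Windows.

```python
import faulthandler
import os
import signal
import tempfile
import time


def install_dump_handlers(log_file):
    """Mirror the setup described above: enable() plus register(SIGTERM).

    log_file must have a working fileno() and must stay open for as long
    as the handlers are installed; faulthandler writes to the raw
    descriptor from inside the signal handler.
    """
    # Dump tracebacks of all threads on fatal signals such as SIGSEGV.
    faulthandler.enable(file=log_file, all_threads=True)
    # Also dump tracebacks when SIGTERM arrives. With the default
    # chain=False the previous SIGTERM handler is not called, so the
    # process keeps running after the dump.
    faulthandler.register(signal.SIGTERM, file=log_file, all_threads=True)


def dump_on_sigterm_demo():
    """Send SIGTERM to ourselves and return whatever faulthandler wrote."""
    with tempfile.TemporaryFile() as log_file:
        install_dump_handlers(log_file)
        try:
            os.kill(os.getpid(), signal.SIGTERM)
            # Brief pause in case the signal is delivered asynchronously.
            time.sleep(0.1)
            log_file.seek(0)
            return log_file.read().decode("utf-8", "replace")
        finally:
            # Undo the handlers before the file is closed.
            faulthandler.unregister(signal.SIGTERM)
            faulthandler.disable()


if __name__ == "__main__":
    print(dump_on_sigterm_demo())
```

In the real environment the SIGTERM came from an external timeout monitor rather than from the process itself, and the extra threads were created by extension modules, which this sketch does not model.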

Our stuck process had received the SIGTERM, and analyzing it revealed a deadlock between two threads which appeared to involve this faulthandler path.

Let me gather more info into one place.  From there I should be able to come up with a way to reproduce it (or even better, not).

----------
assignee: haypo -> gregory.p.smith

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue30165>
_______________________________________

