[New-bugs-announce] [issue17969] multiprocessing crash on exit
Kristján Valur Jónsson
report at bugs.python.org
Mon May 13 13:31:34 CEST 2013
New submission from Kristján Valur Jónsson:
We have observed this crash with some frequency when running our compilation scripts using multiprocessing.Pool().
Analysing the crashes shows the following sequence:
1) The Pool has a "daemon" thread managing the pool.
2) That thread is asleep, waiting for the GIL.
3) The main thread exits and the system starts its shutdown. During PyInterpreterState_Clear it clears, among other things, the sys dict. In the process, it clears an old traceback, which contains a multiprocessing.connection object.
4) The connection object is cleared. Its deallocator contains this code:
Py_BEGIN_ALLOW_THREADS
CLOSE(self->handle);
Py_END_ALLOW_THREADS
5) The sleeping daemon thread is woken up and starts prancing around. Upon calling sys.exc_clear() it crashes, since tstate->interp->sysdict == NULL.
I have a workaround in place in our codebase:
static void
connection_dealloc(ConnectionObject* self)
{
    if (self->weakreflist != NULL)
        PyObject_ClearWeakRefs((PyObject*)self);
    if (self->handle != INVALID_HANDLE_VALUE) {
        /* CCP Change.  Cannot release threads here, because this
         * deallocation may be running during process shutdown, and
         * releasing a daemon thread will cause a crash
        Py_BEGIN_ALLOW_THREADS
        CLOSE(self->handle);
        Py_END_ALLOW_THREADS
         */
        CLOSE(self->handle);
    }
    PyObject_Del(self);
}
In general, deallocators should have no side effects, I think. Releasing the GIL is certainly a side effect.
I realize that process shutdown is a delicate matter. One delicate thing is that we cannot allow worker threads to run anymore. I see no general mechanism for ensuring this, but surely at least not releasing the GIL for deallocators is a first step?
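Until such a mechanism exists, an application can at least make sure that the daemon threads it owns have finished before the main thread exits. The following is a minimal sketch of that idea, not the Pool's actual implementation; the names start_manager and shutdown_manager, and the polling interval, are made up for illustration. It should run on Python 2.7 as well as Python 3.

```python
import threading


def start_manager(stop_event, poll_interval=0.05):
    """Start a daemon thread that idles until stop_event is set.

    Hypothetical stand-in for a pool-management thread.
    """
    def manage():
        while not stop_event.is_set():
            # Sleep for up to poll_interval, waking early if asked to stop.
            stop_event.wait(poll_interval)

    thread = threading.Thread(target=manage)
    thread.daemon = True
    thread.start()
    return thread


def shutdown_manager(thread, stop_event, timeout=5.0):
    """Ask the thread to stop and wait for it.

    Returns True if the thread has finished, so that nothing is left to
    wake up while the interpreter is being torn down.
    """
    stop_event.set()
    thread.join(timeout)
    return not thread.is_alive()


if __name__ == "__main__":
    stop = threading.Event()
    manager = start_manager(stop)
    # ... do the real work here ...
    print(shutdown_manager(manager, stop))
```

For a multiprocessing.Pool the analogous step would presumably be to call pool.close() followed by pool.join() before the main thread exits, since join() waits for the pool's handler threads. That only covers threads the application knows about; it does not fix the deallocator itself.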
----------
messages: 189123
nosy: kristjan.jonsson
priority: normal
severity: normal
status: open
title: multiprocessing crash on exit
type: crash
versions: Python 2.7
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue17969>
_______________________________________