[issue35267] reproducible deadlock with multiprocessing.Pool

dzhu report at bugs.python.org
Fri Nov 16 20:06:39 EST 2018


New submission from dzhu <bpo.afbo at dfgh.net>:

The attached snippet causes a deadlock just about every time it's run (tested with 3.6.7/Ubuntu, 3.7.1/Arch, 3.6.7/OSX, and 3.7.1/OSX -- deadlock seems to be less frequent on the last, but still common). The issue appears to be something like the following sequence of events:

1. The main thread calls pool.__exit__, eventually entering Pool._terminate_pool.
2. result_handler's state is set to TERMINATE, causing it to stop reading from outqueue.
3. The main thread, in _terminate_pool, joins on worker_handler, which is (usually) in the middle of sleeping for 0.1 seconds, opening a window for the next two steps to occur.
4. The worker process finishes its task and acquires the shared outqueue._wlock.
5. The worker attempts to put the result into outqueue, but its pickled form is too big to fit into the buffer of the underlying os.pipe, so the write blocks with the lock still held.
6. worker_handler wakes up and exits, freeing _terminate_pool to continue.
7. _terminate_pool terminates the worker.
8. task_handler tries to put None into outqueue, but blocks, since outqueue._wlock was still held by the worker when it was terminated and is therefore never released.
9. _terminate_pool joins on task_handler, and everything is deadlocked.
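
The core of steps 4 through 8 can be shown in isolation, without Pool: a multiprocessing lock held by a process at the moment it is terminated is never released. The following is only a minimal sketch and is not the attached lock.py; the function names are hypothetical, time.sleep stands in for the blocked pipe write, and the "fork" start method is preferred where available purely to keep the example self-contained.

```python
import multiprocessing as mp
import time


def hold_lock(lock, acquired):
    # Hypothetical stand-in for the worker: take the lock, then block.
    # The sleep plays the role of a pipe write that cannot complete.
    lock.acquire()
    acquired.set()
    time.sleep(60)


def lock_free_after_terminate():
    # Prefer "fork" where it exists; otherwise use the platform default.
    method = "fork" if "fork" in mp.get_all_start_methods() else None
    ctx = mp.get_context(method)

    lock = ctx.Lock()
    acquired = ctx.Event()
    child = ctx.Process(target=hold_lock, args=(lock, acquired))
    child.start()

    # Wait until the child really holds the lock, then kill it.
    if not acquired.wait(timeout=10):
        child.terminate()
        child.join()
        raise RuntimeError("child never acquired the lock")
    child.terminate()
    child.join()

    # The lock has no notion of an owner, so the child's death does not
    # release it. A timeout is used so this call cannot hang.
    got = lock.acquire(timeout=1)
    if got:
        lock.release()
    return got


if __name__ == "__main__":
    print("lock free after terminate:", lock_free_after_terminate())
```

In the Pool case there is no timeout on the equivalent acquire inside outqueue.put, which is why task_handler blocks indefinitely in step 8.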

----------
components: Library (Lib)
files: lock.py
messages: 330017
nosy: dzhu
priority: normal
severity: normal
status: open
title: reproducible deadlock with multiprocessing.Pool
type: behavior
versions: Python 3.6, Python 3.7
Added file: https://bugs.python.org/file47937/lock.py

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue35267>
_______________________________________

