[issue45304] Supporting out-of-band buffers (pickle protocol 5) in multiprocessing

jakirkham report at bugs.python.org
Mon Sep 27 15:04:19 EDT 2021


New submission from jakirkham <jakirkham at gmail.com>:

Python 3.8 added pickle protocol 5 (PEP 574), which supports out-of-band buffer collection[1]. The idea is that when pickling an object with a large amount of data attached to it (like an array, dataframe, etc.), one can collect that data alongside the normal pickled stream without copying it. This matters in particular when serializing data for communication between two Python processes. IOW this is quite valuable when using a `multiprocessing.pool.Pool`[2] or a `concurrent.futures.ProcessPoolExecutor`[3]. However, AFAICT neither of these leverages this functionality[4][5]. To enable zero-copy processing of large data, it would be helpful for both of these pools to use pickle protocol 5 with out-of-band buffers.
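
For illustration, the out-of-band collection described above can be sketched with the standard `pickle` API alone. This is a minimal sketch, not what `multiprocessing` does today: the 1 MB `payload` is a hypothetical stand-in for an array or dataframe buffer, and both sides run in one process so that the lack of a copy is easy to check.

```python
import pickle

# Hypothetical 1 MB payload standing in for an array or dataframe buffer.
payload = bytearray(b"x" * 1_000_000)

# In-band: the payload bytes are copied into the pickle stream itself.
in_band = pickle.dumps(pickle.PickleBuffer(payload), protocol=5)

# Out-of-band: each PickleBuffer is handed to buffer_callback instead, and
# the stream only records that a buffer is expected at this position.
buffers = []
stream = pickle.dumps(
    pickle.PickleBuffer(payload), protocol=5, buffer_callback=buffers.append
)

# The receiver supplies the same buffers, in the same order, when loading.
restored = pickle.loads(stream, buffers=buffers)

# Sizes of the in-band stream, the out-of-band stream, and the buffer count.
print(len(in_band), len(stream), len(buffers))
# The restored object still views the original memory.
print(memoryview(restored).obj is payload)
```

Across processes the collected buffers would still have to be transferred separately from the stream; how to do that without a copy is the part this issue asks the pools to handle.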


[1] https://docs.python.org/3/library/pickle.html#pickle-oob
[2] https://docs.python.org/3/library/multiprocessing.html#multiprocessing.pool.Pool
[3] https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.ProcessPoolExecutor
[4] https://github.com/python/cpython/blob/16b5bc68964c6126845f4cdd54b24996e71ae0ba/Lib/multiprocessing/queues.py#L372
[5] https://github.com/python/cpython/blob/16b5bc68964c6126845f4cdd54b24996e71ae0ba/Lib/multiprocessing/queues.py#L245

----------
components: IO, Library (Lib)
messages: 402736
nosy: jakirkham
priority: normal
severity: normal
status: open
title: Supporting out-of-band buffers (pickle protocol 5) in multiprocessing
type: performance
versions: Python 3.10, Python 3.11, Python 3.8, Python 3.9

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue45304>
_______________________________________

