Multiprocessing, shared memory vs. pickled copies

sturlamolden sturlamolden at yahoo.no
Thu Apr 7 20:03:01 EDT 2011


On 4 apr, 22:20, John Ladasky <lada... at my-deja.com> wrote:

> https://bitbucket.org/cleemesser/numpy-sharedmem/src/3fa526d11578/shm...
>
> I've added a few lines to this code which allows subclassing the
> shared memory array, which I need (because my neural net objects are
> more than just the array, they also contain meta-data).  But I've run
> into some trouble doing the actual sharing part.  The shmarray class
> CANNOT be pickled.

That is hilarious :-)

I see that the bitbucket page has my name and Gaël's on it, but the
code is changed and broken beyond repair! I don't want to be
associated with that crap!

Their "shmarray.py" will not work -- ever. It fails in two ways:

1. multiprocessing.Array cannot be pickled (as you noticed). It is
shared by handle inheritance. Thus we (that is Gaël and I) made a
shmem buffer object that could be pickled by giving it a name in the
file system, instead of sharing it anonymously by inheriting the
handle. Obviously those behind the bitbucket page don't understand the
difference between named and anonymous shared memory (that is, System
V IPC and BSD mmap, respectively).

2. Because shmarray subclasses numpy.ndarray, a pickle dump would encode
a copy of the buffer. But that is what we want to avoid! We want to
share the buffer itself, not make a copy of it! So instead of
subclassing ndarray, we changed how numpy pickles arrays that point to
shared memory. I did that by slightly modifying some code written by
Robert Kern.
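
To make the two points concrete, here is a minimal sketch. It is not
the code from the sharedmem package: it assumes a modern Python (3.8 or
later) and uses the standard-library multiprocessing.shared_memory
module as a stand-in for the named buffer described above. The function
names are made up for illustration. The first function shows that an
anonymous multiprocessing.Array refuses to be pickled outside process
start-up; the second shows that a named segment pickles as a small
reference and that the unpickled object sees the same bytes rather than
a copy.

```python
import pickle
import multiprocessing
from multiprocessing import shared_memory


def anonymous_array_is_not_picklable():
    """Return True if pickling a multiprocessing.Array raises RuntimeError."""
    arr = multiprocessing.Array("d", 4)  # anonymous, shared by inheritance
    try:
        pickle.dumps(arr)
    except RuntimeError:
        return True
    return False


def named_segment_pickles_by_name(nbytes=4096):
    """Pickle a named segment; return (payload_is_small, memory_is_shared)."""
    owner = shared_memory.SharedMemory(create=True, size=nbytes)
    try:
        payload = pickle.dumps(owner)     # carries the name, not the bytes
        attached = pickle.loads(payload)  # re-attaches to the same segment
        try:
            attached.buf[0] = 42          # write through the unpickled handle
            shared = owner.buf[0] == 42   # visible through the original one
        finally:
            attached.close()
        return len(payload) < nbytes, shared
    finally:
        owner.close()
        owner.unlink()                    # remove the name when done


if __name__ == "__main__":
    print(anonymous_array_is_not_picklable())
    print(named_segment_pickles_by_name())
```

With numpy installed, an array view onto such a segment can be built
with numpy.ndarray(shape, dtype=dtype, buffer=segment.buf). Pickling
that array the default way would still copy its data, which is the
problem point 2 describes, so a custom reducer that sends only the
segment name, shape and dtype would still be needed.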

http://folk.uio.no/sturlamo/python/sharedmem-feb13-2009.zip
Known issues/bugs: 64-bit support is lacking, and os._exit in
multiprocessing causes a memory leak on Linux.


Sturla


