[Numpy-discussion] Numpy arrays shareable among related processes (PR #7533)

Sturla Molden sturla.molden at gmail.com
Tue May 17 08:13:42 EDT 2016


Matěj Týč <matej.tyc at gmail.com> wrote:

>  - Parallel processing of HUGE data, and

This is mainly a Windows problem, as copy-on-write fork() will solve this
on any other platform. I am more in favor of asking Microsoft to fix their
broken OS.

Also observe that the usefulness of shared memory is very limited on
Windows, because in practice we never get the same base address in a
spawned process. This prevents sharing data structures that contain
pointers, including Python objects. Anything more complex than a plain
array cannot be shared.

What this means is that shared memory is seldom useful for sharing huge
data, even on Windows. It is only useful for this on Unix/Linux, where base
addresses can stay the same. But on non-Windows platforms, COW will in
99.99% of cases be sufficient, making shared memory superfluous anyway. We
don't need shared memory to scatter large data on Linux, only fork.
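A minimal sketch of the point above: data allocated before fork() is visible in the child via copy-on-write, with no explicit sharing machinery. (This uses a plain bytes buffer as a stand-in for a huge NumPy array, and os.fork() is Unix-only, which is exactly the platform distinction being made.)

```python
import os

# ~1 MB read-only buffer created before fork(); the child inherits the
# parent's pages copy-on-write, so nothing is physically copied unless
# a page is written to.
data = bytes(range(256)) * 4096

pid = os.fork()
if pid == 0:
    # Child: reads the parent's data with no shared-memory setup at all.
    ok = (data[0] == 0 and data[-1] == 255)
    os._exit(0 if ok else 1)
else:
    _, status = os.waitpid(pid, 0)
    child_ok = (os.WEXITSTATUS(status) == 0)
    print("child saw the parent's data:", child_ok)
```

On Windows there is no fork(); multiprocessing must spawn a fresh interpreter, so the child starts with none of the parent's data, which is why the problem is Windows-specific.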

As I see it, shared memory is mostly useful as a means to construct an
inter-process communication (IPC) protocol.
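As one illustration of that IPC use (not something from the original post): Python 3.8+ ships multiprocessing.shared_memory, where a named block created by one process can be attached by name from another. The second attach below stands in for what a peer process would do.

```python
from multiprocessing import shared_memory

# Create a named shared-memory block; the name is the rendezvous point
# that a simple IPC protocol can be built around.
shm = shared_memory.SharedMemory(create=True, size=16)
shm.buf[:5] = b"hello"

# Attach a second handle by name, as a peer process would.
peer = shared_memory.SharedMemory(name=shm.name)
msg = bytes(peer.buf[:5])
print(msg)

# Every attacher closes its handle; the creator also unlinks the block.
peer.close()
shm.close()
shm.unlink()
```

Note that only raw bytes cross the boundary: each process maps the block at its own base address, so pointer-bearing structures still cannot be shared this way.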

Sturla

