[Python-ideas] solving multi-core Python

Sturla Molden sturla.molden at gmail.com
Wed Jun 24 23:41:02 CEST 2015


On 24/06/15 22:50, M.-A. Lemburg wrote:

> The tricky part is managing pointers in those data structures,
> e.g. a container type for other Python objects will have to
> store all referenced objects in the shared memory segment as
> well.

If a container type for Python objects contains an object of some
unknown type, we would have to use pickle as a fallback.


> For NumPy arrays using simple types this is a lot easier,
> since you don't have to deal with pointers to other objects.

The objects we deal with in scientific computing are usually arrays with 
a rather regular structure, not deeply nested Python objects. Even a 
more complex object like scipy.spatial.cKDTree is just a collection of a 
few contiguous arrays under the hood. So we could, for the most part,
eliminate the pickle overhead that users encounter by specializing a
queue that knows about a small set of Python types.
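
As a rough illustration of that idea, here is a minimal sketch of the
serialization side of such a queue: flat arrays of simple types take a
fast path that copies the raw buffer, and everything else falls back to
pickle. The tag bytes, the wire format and the names encode/decode are
assumptions made up for this example, and the stdlib array.array stands
in for a NumPy array so the snippet runs without NumPy. A real version
would also have to carry dtype, shape and strides.

```python
import pickle
from array import array

# Hypothetical wire format: one tag byte, then the payload.
TAG_ARRAY = b"A"   # fast path: typecode character + raw buffer
TAG_PICKLE = b"P"  # fallback: ordinary pickle stream


def encode(obj):
    """Serialize obj, bypassing pickle for flat arrays of simple types."""
    if isinstance(obj, array):
        # The typecode is a single ASCII character; the rest is raw bytes.
        return TAG_ARRAY + obj.typecode.encode("ascii") + obj.tobytes()
    # Unknown type: fall back to pickle.
    return TAG_PICKLE + pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)


def decode(msg):
    """Inverse of encode()."""
    tag, body = msg[:1], msg[1:]
    if tag == TAG_ARRAY:
        out = array(chr(body[0]))   # rebuild an empty array of that typecode
        out.frombytes(body[1:])     # fill it from the raw buffer
        return out
    if tag == TAG_PICKLE:
        return pickle.loads(body)
    raise ValueError("unknown message tag: %r" % tag)
```

A queue built on this would call encode() in put() and decode() in
get(), so only the types it does not recognize pay the pickle cost.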


> When saying "passing a lot of binary data over a pipe" you mean
> the meta-data ?

No, I mean the buffer pointed to by PyArray_DATA(obj) when using the
NumPy C API. We have to send a lot of raw bytes over an IPC mechanism
before the cost of that transfer becomes comparable to the pickle
overhead.
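
A minimal sketch of what sending just the raw buffer looks like from
Python, assuming a multiprocessing pipe as the IPC mechanism. The
helper names send_buffer and recv_into are hypothetical, and the stdlib
array.array again stands in for a NumPy array; both expose their data
through the buffer protocol, which is all this code relies on. The
receiver is assumed to know the size and type in advance.

```python
from array import array
from multiprocessing import Pipe


def send_buffer(conn, buf):
    """Send the raw bytes of a buffer-protocol object, without pickling."""
    conn.send_bytes(memoryview(buf).cast("B"))


def recv_into(conn, out):
    """Receive one message directly into a preallocated writable buffer.

    Returns the number of bytes received.
    """
    return conn.recv_bytes_into(memoryview(out).cast("B"))


if __name__ == "__main__":
    src = array("d", range(1000))
    dst = array("d", bytes(src.itemsize * len(src)))  # zero-filled target
    a, b = Pipe()
    # Same process and a small payload, just to keep the example short.
    send_buffer(a, src)
    nbytes = recv_into(b, dst)
    print(nbytes, dst == src)
```

To compare this against the pickle route on a given machine, one could
time it with timeit next to conn.send(obj) and conn.recv(), which
pickle the object.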


Sturla












