[Python-ideas] solving multi-core Python
Sturla Molden
sturla.molden at gmail.com
Wed Jun 24 23:41:02 CEST 2015
On 24/06/15 22:50, M.-A. Lemburg wrote:
> The tricky part is managing pointers in those data structures,
> e.g. a container type for other Python objects will have to
> store all referenced objects in the shared memory segment as
> well.
If a container type for Python objects contains an object of some unknown
type, we would have to use pickle as a fallback.
> For NumPy arrays using simple types this is a lot easier,
> since you don't have to deal with pointers to other objects.
The objects we deal with in scientific computing are usually arrays with
a rather regular structure, not deeply nested Python objects. Even a
more complex object like scipy.spatial.cKDTree is just a collection of a
few contiguous arrays under the hood. So we could, for the most part,
eliminate the pickle overhead that anyone would encounter by specializing
a queue that knows about a small set of Python types.
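
A rough sketch of what the serialization layer of such a queue could look
like, including the pickle fallback for unknown types. This is only an
illustration, not an existing implementation: the one-byte tags and the
names encode/decode are made up, and the standard library's array.array
stands in for a NumPy array so the example has no dependencies.

```python
import pickle
from array import array

# Hypothetical one-byte tags marking how a message was encoded.
TAG_ARRAY = b"A"   # fast path: typecode followed by the raw buffer
TAG_PICKLE = b"P"  # fallback: ordinary pickle


def encode(obj):
    """Serialize obj, bypassing pickle for the types we know about."""
    if isinstance(obj, array):
        # The typecode is a single ASCII character; the payload is the
        # array's memory copied out as bytes.
        return TAG_ARRAY + obj.typecode.encode("ascii") + obj.tobytes()
    # Unknown type: fall back to pickle.
    return TAG_PICKLE + pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)


def decode(msg):
    """Inverse of encode()."""
    tag, body = msg[:1], msg[1:]
    if tag == TAG_ARRAY:
        out = array(chr(body[0]))   # first byte is the typecode
        out.frombytes(body[1:])     # the rest is the raw buffer
        return out
    if tag == TAG_PICKLE:
        return pickle.loads(body)
    raise ValueError("unknown message tag: %r" % tag)
```

The resulting bytes could then be handed to any byte-oriented transport.
Supporting another known type would mean adding one more tag and one more
branch on each side.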
> When saying "passing a lot of binary data over a pipe" you mean
> the meta-data ?
No, I mean the buffer pointed to by PyArray_DATA(obj) when using the
NumPy C API. We have to send a lot of raw bytes over an IPC mechanism
before the cost of that communication becomes comparable to the pickle
overhead.
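
For illustration, sending only the raw buffer can be sketched from Python
with multiprocessing's Connection.send_bytes and recv_bytes_into. Again
array.array stands in for the NumPy buffer, the function name
send_raw_roundtrip is hypothetical, and both ends of the pipe live in one
process just to keep the example short (so the payload must stay small
enough not to block the sender).

```python
from array import array
from multiprocessing import Pipe


def send_raw_roundtrip(values):
    """Send an array of doubles through a pipe as raw bytes and read it back."""
    src = array("d", values)
    dst = array("d", [0.0] * len(src))  # preallocated receive buffer

    sender, receiver = Pipe()
    try:
        # No pickling here: the array's memory is written to the pipe as-is.
        sender.send_bytes(src)
        # The message is read directly into dst's memory.
        nbytes = receiver.recv_bytes_into(dst)
    finally:
        sender.close()
        receiver.close()
    return dst, nbytes
```

The receiver has to know the element type and length in advance, which is
the small piece of metadata a specialized queue would send alongside the
buffer.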
Sturla