[SciPy-User] Numpy pickle format
Francesc Alted
faltet at pytables.org
Mon Nov 29 08:46:04 EST 2010
Hi David,
On Thursday 25 November 2010 00:22:02, David Baddeley wrote:
> Thanks heaps for the detailed reply! That looks like it should be
> enough info to get me started ... I know it's a bit of a niche
> application, but is there likely to be anyone else out there who's
> likely to be interested in similar functionality? Just want to know
> if it's worth taking the time to think about supporting some of the
> additional aspects of the protocol (eg c/fortran order) before I
> cobble something together - I wonder if one could wrap JAMA to
> provide some very basic array functionality ...
I'm interested. I'm looking to adopt a protocol for sending arrays that
can serialize/deserialize them without duplicating the contents in
memory (so that the serialized and the deserialized versions do not
have to exist at the same time).
My idea is to adopt something similar to the native NPY format for
files:
http://svn.scipy.org/svn/numpy/trunk/doc/neps/npy-format.txt
but adapting it to support blocking, that is, being able to send the
array in blocks and restore the original array by assembling these
blocks. That way, the serialized and deserialized versions do not have
to coexist in the same process memory (only one block does) while
sending the stream to its destination. As a plus, this would make it
possible to compress blocks transparently and, with a little more
effort, perhaps even allow random access when the serialization goes
to a file on disk (and not to a stream).
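A minimal sketch of the blocking idea, not a proposed wire format. The
names iter_blocks and assemble, the block_bytes parameter and the use of
zlib are all assumptions made for illustration. For simplicity it always
sends the data in C order, and the receiver is assumed to know the dtype
and shape in advance.

```python
import zlib
import numpy as np

def iter_blocks(arr, block_bytes=1 << 16, compress=False):
    """Yield the array's raw bytes in blocks of roughly block_bytes each."""
    # C-order flat view of the data (a copy is made only if arr is not C-contiguous).
    flat = np.ascontiguousarray(arr).reshape(-1)
    step = max(1, block_bytes // flat.itemsize)  # elements per block
    for start in range(0, flat.size, step):
        payload = flat[start:start + step].tobytes()  # only one block is serialized at a time
        yield zlib.compress(payload) if compress else payload

def assemble(blocks, dtype, shape, compress=False):
    """Rebuild the array by filling a preallocated buffer block by block."""
    out = np.empty(shape, dtype=dtype)
    flat = out.reshape(-1)  # view on out, so writes land in the result
    pos = 0
    for payload in blocks:
        if compress:
            payload = zlib.decompress(payload)
        chunk = np.frombuffer(payload, dtype=dtype)
        flat[pos:pos + chunk.size] = chunk
        pos += chunk.size
    return out

a = np.arange(1000, dtype=np.float64).reshape(10, 100)
b = assemble(iter_blocks(a, block_bytes=256, compress=True),
             a.dtype, a.shape, compress=True)
```

Because iter_blocks is a generator, the sender holds the original array
plus one serialized block at any moment, and the receiver holds the
destination array plus one incoming block.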
I'm thinking of supporting just the metadata that NPY supports right
now, that is, the dtype, the C/Fortran order and the shape, that's all.
Once this format is settled, several implementations can be written
(on top of Pyro or zeromq, or just using something in the Python
standard library).
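For reference, the three pieces of metadata mentioned above can be read
back from an existing NPY header with the helpers in numpy.lib.format.
This is only a sketch of what NPY stores today, assuming a version 1.0
header; the example array is made up for illustration.

```python
import io
import numpy as np
from numpy.lib import format as npy_format

# Example array stored in Fortran order (illustrative data only).
arr = np.asfortranarray(np.arange(6, dtype=np.int32).reshape(2, 3))

buf = io.BytesIO()
npy_format.write_array(buf, arr, version=(1, 0))  # magic string + header + raw data
buf.seek(0)

version = npy_format.read_magic(buf)  # (major, minor) format version
shape, fortran_order, dtype = npy_format.read_array_header_1_0(buf)

# The bytes after the header are the raw data, laid out in the order given
# by fortran_order.
data = np.frombuffer(buf.read(), dtype=dtype)
restored = data.reshape(shape, order='F' if fortran_order else 'C')
```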
Do you think that this approach would fulfill your requirements?
--
Francesc Alted