[SciPy-User] Numpy pickle format

Francesc Alted faltet at pytables.org
Mon Nov 29 08:46:04 EST 2010


Hi David,

On Thursday 25 November 2010 00:22:02 David Baddeley wrote:
> Thanks heaps for the detailed reply! That looks like it should be
> enough info to get me started ... I know it's a bit of a niche
> application, but is there likely to be anyone else out there who's
> likely to be interested in similar functionality? Just want to know
> if it's worth taking the time to think about supporting some of the
> additional aspects of the protocol (eg c/fortran order) before I
> cobble something together -  I wonder if one could wrap JAMA to
> provide some very basic array functionality ...

I'm interested.  I'm looking to adopt a protocol for sending arrays that 
can serialize/deserialize them without duplicating the contents in 
memory (so that the serialized version and the deserialized one do not 
have to exist at the same time).

My idea is to adopt something similar to the native NPY format for 
files:

http://svn.scipy.org/svn/numpy/trunk/doc/neps/npy-format.txt

but adapted to support blocking, that is, being able to send the array 
in blocks and to restore the original array by assembling those blocks.  
That way, the serialized and deserialized versions do not have to 
coexist in the same process memory (only one block at a time has to) 
while the stream is sent to its destination.  As a plus, this would make 
it possible to compress blocks transparently and, with a little more 
effort, perhaps even allow random access when the serialization goes to 
a file on disk (rather than to a stream).
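
To make the idea concrete, here is a minimal sketch of what such a 
blocked stream might look like.  It is only an illustration, not an 
existing format: the JSON header, the 4-byte length prefixes, the use of 
zlib, the default block size and the names write_blocked/read_blocked 
are all assumptions, and it only handles simple (non-object, 
non-structured) dtypes.

```python
import io
import json
import math
import struct
import zlib

import numpy as np

BLOCK_BYTES = 1 << 16  # hypothetical default block size, not part of any spec


def write_blocked(arr, stream, block_bytes=BLOCK_BYTES):
    """Write a small header, then length-prefixed zlib-compressed blocks."""
    fortran = bool(arr.flags.f_contiguous and not arr.flags.c_contiguous)
    order = "F" if fortran else "C"
    # Same three metadata fields as NPY: dtype, C/Fortran order, shape.
    header = json.dumps(
        {"descr": arr.dtype.str, "fortran_order": fortran, "shape": arr.shape}
    ).encode("ascii")
    stream.write(struct.pack("<I", len(header)))
    stream.write(header)
    # 1-D view in storage order (a full copy only if arr is non-contiguous).
    flat = arr.ravel(order=order)
    step = max(1, block_bytes // max(1, arr.itemsize))
    for start in range(0, flat.size, step):
        # Only this one block is duplicated in memory at a time.
        payload = zlib.compress(flat[start:start + step].tobytes())
        stream.write(struct.pack("<I", len(payload)))
        stream.write(payload)


def _read_exact(stream, n):
    data = stream.read(n)
    if len(data) != n:
        raise EOFError("stream ended before the expected number of bytes")
    return data


def read_blocked(stream):
    """Rebuild the array by filling a preallocated buffer block by block."""
    (hlen,) = struct.unpack("<I", _read_exact(stream, 4))
    meta = json.loads(_read_exact(stream, hlen).decode("ascii"))
    dtype = np.dtype(meta["descr"])
    shape = tuple(meta["shape"])
    order = "F" if meta["fortran_order"] else "C"
    flat = np.empty(math.prod(shape), dtype=dtype)
    pos = 0
    while pos < flat.size:
        (clen,) = struct.unpack("<I", _read_exact(stream, 4))
        block = np.frombuffer(zlib.decompress(_read_exact(stream, clen)), dtype=dtype)
        flat[pos:pos + block.size] = block
        pos += block.size
    # Reshaping a contiguous 1-D buffer is a view, so nothing is copied here.
    return flat.reshape(shape, order=order)


# Round trip through an in-memory stream, using tiny blocks on purpose.
original = np.asfortranarray(np.arange(12, dtype=np.float64).reshape(3, 4))
buf = io.BytesIO()
write_blocked(original, buf, block_bytes=32)
buf.seek(0)
restored = read_blocked(buf)
```

Since every block carries its own length prefix, a reader could also 
skip over blocks it does not need; real random access in a file would 
still require an index of block offsets, which this sketch omits.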

I'm thinking of supporting just the metadata that NPY supports right 
now, that is, the dtype, the C/Fortran order and the shape, and nothing 
more.  Once the format is settled, several implementations could 
follow (e.g. on top of Pyro or zeromq, or using just the Python 
standard library).
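
For reference, those three fields can be read back from an NPY header 
with the helpers in numpy.lib.format.  A small sketch, assuming a 
version 1.0 header as np.save writes for a plain array (the variable 
names are just for illustration):

```python
import io

import numpy as np
from numpy.lib import format as npy_format

# A Fortran-ordered array, so that the order flag is visible in the header.
arr = np.asfortranarray(np.zeros((3, 4), dtype=np.float64))

buf = io.BytesIO()
np.save(buf, arr)
buf.seek(0)

# The magic string and version come first, then the header dictionary.
version = npy_format.read_magic(buf)
shape, fortran_order, dtype = npy_format.read_array_header_1_0(buf)
```

After this, shape, fortran_order and dtype hold everything needed to 
allocate the destination array before any data arrives.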

Do you think that this approach would fulfill your requirements?

-- 
Francesc Alted


