[PYTHON MATRIX-SIG] Re: Saving HUGE arrays

Guido van Rossum guido@CNRI.Reston.Va.US
Thu, 21 Nov 1996 01:32:15 -0500


(Crossposting from the matrix-sig to the main Python list.)

>  I am working with huge arrays (size comparable to the total swap
> space of the machine, i.e. ~100 MB).
>  Currently, we write these arrays to disk using the following
> procedure:
>     fp.write( some_header_info )
>     fp.write( arr.tostring() )
> or in some case:
>     fp.write( arr.byteswapped().tostring() )
> 
> This works but duplicates the array data to get a string.
> 
> I know Python strings are immutable, so sharing data between a string
> and an array is prohibited.
> Should I write a simple C function
> 
>     write_array( arr, fp, byte_order )
> 
> directly writing the data to a file, swapping bytes if requested,
> 
> or is there a more standard solution?
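
For concreteness, here is a minimal sketch of the kind of C helper
asked about above.  It is an illustration only, not an existing
function: the signature is an assumption (a raw pointer, an item count
and an item size stand in for the array object, and a plain flag stands
in for byte_order).  It swaps through a small fixed-size chunk, so the
array data is never duplicated as a whole.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not part of any existing module.  Only
   CHUNK_BYTES of scratch space are used, however large the array is. */
enum { CHUNK_BYTES = 4096 };

/* Write `count` items of `itemsize` bytes each to fp, reversing the
   bytes of every item when `swap` is nonzero.  Returns 0 on success
   and -1 on error. */
static int write_array(const void *data, size_t count, size_t itemsize,
                       int swap, FILE *fp)
{
    const unsigned char *src = data;
    unsigned char chunk[CHUNK_BYTES];
    size_t per_chunk, done = 0;

    assert(fp != NULL);
    if (itemsize == 0 || itemsize > CHUNK_BYTES)
        return -1;

    /* No swapping needed: write straight from the array's memory. */
    if (!swap || itemsize == 1)
        return fwrite(src, itemsize, count, fp) == count ? 0 : -1;

    per_chunk = CHUNK_BYTES / itemsize;
    while (done < count) {
        size_t n = count - done < per_chunk ? count - done : per_chunk;
        size_t i, j;

        /* Copy up to one chunk of items, reversing each item's bytes. */
        for (i = 0; i < n; i++) {
            const unsigned char *item = src + (done + i) * itemsize;
            for (j = 0; j < itemsize; j++)
                chunk[i * itemsize + j] = item[itemsize - 1 - j];
        }
        if (fwrite(chunk, itemsize, n, fp) != n)
            return -1;
        done += n;
    }
    return 0;
}
```

A caller would still have to obtain the data pointer and item size from
the array object, which is the part that needs a standard interface.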

Hmm, interesting.  Jack Jansen just mailed me an idea for a standard
way to access "buffer-like" Python objects as linear sequences of
bytes in Python, especially for I/O purposes.  This should work for
strings, arrays from the old array module, and numerical arrays, and
other extension types that have an internal representation that is a
contiguous sequence of bytes.  (Jack's motivation was that it would
make extensions for manipulating sound data easier to write; it seems
to be the same problem as the one stated here.)

Does the matrix community think this would be useful?  Any thoughts on
the form of the interface?

I would presume the interface could be something like

int PyBuffer_GetInfo(object *o, const char **address, int *length);

to store the address and size of a "buffer" object in a C char pointer
(or should that be void?) and an int length.  The return value would
be -1 if the object was not "bufferable".  I guess a PyBuffer_Check(o)
call should also be available.  For *mutable* buffer objects, a version
that returns a non-const pointer could be provided (but for strings
this would fail) so one could read data *into* a buffer object.
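
To make the calling convention concrete, here is a toy sketch of how
an I/O routine might use such an interface.  Nothing here is real
Python API: toy_object is a made-up stand-in for the interpreter's
object type, toy_write is a hypothetical caller, and the two
PyBuffer_* functions are throwaway models of the calls proposed above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Made-up stand-in for a Python object; the real type is not modelled. */
typedef struct {
    int bufferable;     /* nonzero if the object is contiguous bytes */
    const char *data;   /* start of the internal representation */
    int length;         /* size of that representation in bytes */
} toy_object;

/* Toy model of the proposed check: is this object "bufferable"? */
static int PyBuffer_Check(const toy_object *o)
{
    return o != NULL && o->bufferable;
}

/* Toy model of the proposed call: fill in address and length, or
   return -1 if the object is not "bufferable". */
static int PyBuffer_GetInfo(const toy_object *o, const char **address,
                            int *length)
{
    if (!PyBuffer_Check(o))
        return -1;
    *address = o->data;
    *length = o->length;
    return 0;
}

/* Hypothetical caller: write any bufferable object to a file without
   first converting it to a string.  Returns 0 on success, -1 on error. */
static int toy_write(const toy_object *o, FILE *fp)
{
    const char *address;
    int length;

    assert(fp != NULL);
    if (PyBuffer_GetInfo(o, &address, &length) < 0)
        return -1;
    return fwrite(address, 1, (size_t)length, fp) == (size_t)length ? 0 : -1;
}
```

The point of the sketch is only that the caller never needs to know
whether it was handed a string, an array or a matrix.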

--Guido van Rossum (home page: http://www.python.org/~guido/)

=================
MATRIX-SIG  - SIG on Matrix Math for Python

send messages to: matrix-sig@python.org
administrivia to: matrix-sig-request@python.org
=================