[Matrix-SIG] Binary file I/O

Christos Siopis siopis@astro.ufl.edu
Sat, 13 Mar 1999 20:47:09 -0500 (EST)


OK, here is the last question: One of the first things I needed
with NumPy was binary file I/O. I found Travis' numpyio module,
but like he says, some of numpyio's functionality is already
supported by Python itself, with the disadvantage that a copy of
the array has to be made (e.g., using fromstring() and tostring()).
Is this also the case when one uses the built-in array module instead?

For example, to read 10 float elements from an open binary file
object f:

>>> import Numeric, array
>>> 
>>> a=array.array('f') 
>>> a.fromfile(f, 10)
>>> b = Numeric.array(a, copy=0)

Will the last line create a new copy of the array in memory much
like it would create a new copy from a string? Or is it that the
array.array and Numeric.array objects are "related enough" that
no copy has to be made?
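
For readers on a current Python, the copy-versus-share distinction
asked about here can be checked with a small sketch. It assumes
NumPy as a stand-in for the Numeric module (an assumption beyond
this post), uses an in-memory io.BytesIO object in place of the open
file f, and does not show what Numeric.array(a, copy=0) itself does:

```python
import array
import io

import numpy as np  # assumption: NumPy stands in for the Numeric module

# Ten 4-byte floats in an in-memory "file", standing in for the open file f.
f = io.BytesIO(array.array('f', range(10)).tobytes())

a = array.array('f')
a.fromfile(f, 10)            # reads 10 elements, not 10 bytes

shared = np.frombuffer(a, dtype=np.float32)  # view on a's buffer, no copy
copied = np.array(a, dtype=np.float32)       # independent copy of the data

a[0] = 42.0                  # modify the original array.array
print(shared[0], copied[0])  # the view follows the change, the copy does not
```

Writing through the original object and then inspecting both results
is a simple way to tell a view from a copy.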

In any case, I was wondering if there is any thought of adding
fromfile() and tofile() methods for Numeric arrays (perhaps
based on Travis' numpyio?). I saw this mentioned in James
Hugunin's very first email posting to this list, but somewhere
along the way it must have been dropped. David Ascher's 1996
NumPy tutorial also suggests using pickle, but this is again
wasteful for large arrays, and the resulting files cannot be
read by other programs.
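
To make the interoperability point concrete, here is a minimal
sketch using only the standard library (array, pickle, struct) on a
current Python. The sample values are made up, and an in-memory
io.BytesIO object stands in for a real file:

```python
import array
import io
import pickle
import struct

a = array.array('f', [1.0, 2.5, -3.0])   # made-up sample values

raw = io.BytesIO()
a.tofile(raw)                  # raw machine values, no header or metadata
raw_bytes = raw.getvalue()

pickled = pickle.dumps(a)      # Python-specific serialization of the same data

# A reader that knows only the layout (3 native floats) can decode the raw form.
values = struct.unpack('3f', raw_bytes)

print(len(raw_bytes), len(pickled))
print(values)
```

The raw form carries no type or shape information, so the reader has
to know the layout in advance; the pickled form carries that
information but can only be decoded by Python.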

I also have a minor suggestion for numpyio, which would also
apply to fromfile()/tofile() if they were ever to be implemented.
Why not make the methods more NumPy-friendly by specifying the
array to be read or written using a shape tuple instead of the
C library's "number of elements" and "size of element" arguments?
I mean something like:

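from Numeric import *

def rbf(f, shape = (-1,), typecode = 'f'):
    # f.read() counts bytes, so scale the element count by the item size;
    # a negative count makes f.read() return the rest of the file.
    nbytes = product(shape) * zeros((1,), typecode).itemsize()
    return reshape(fromstring(f.read(nbytes), typecode), shape)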

where rbf = "read binary file", and f is an open file object. The
effect of the above is:

1. If the shape tuple contains no negative numbers, then
   nelem = product(shape) elements of requested typecode are read
   and an array of the appropriate shape is returned.

2. If the shape tuple contains a negative number, then the entire
   binary file is read and returned in the requested shape. The
   file's size would have to be a multiple of the product of the
   shape tuple's non-negative elements, or else reshape() raises
   an exception.
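
The two cases above can be sketched on a current Python as follows.
This assumes NumPy in place of Numeric and takes a dtype argument
instead of a typecode (both are assumptions beyond this post); the
in-memory file and its contents are made up for the demonstration:

```python
import io

import numpy as np  # assumption: modern NumPy in place of Numeric

def rbf(f, shape=(-1,), dtype=np.float32):
    """Read a binary block from the open file f and return it in the given shape."""
    dtype = np.dtype(dtype)
    if any(n < 0 for n in shape):
        data = f.read()                      # wildcard dimension: read the rest
    else:
        nelem = int(np.prod(shape))          # element count, not byte count
        data = f.read(nelem * dtype.itemsize)
    # frombuffer/reshape raise ValueError if the data does not fit the shape
    return np.frombuffer(data, dtype=dtype).reshape(shape)

# Twelve floats in an in-memory "file" for demonstration.
f = io.BytesIO(np.arange(12, dtype=np.float32).tobytes())
first = rbf(f, (2, 3))     # case 1: reads exactly 6 elements
rest = rbf(f, (-1, 3))     # case 2: reads the remaining 6 elements as rows of 3
```

The arrays returned here are read-only because they view the bytes
object returned by f.read(); calling .copy() on the result gives a
writable array at the cost of one copy.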

Thanks,
Christos