[Numpy-discussion] reading gzip compressed files using numpy.fromfile
Peter Schmidtke
pschmidtke at mmb.pcb.ub.es
Wed Oct 28 15:31:43 EDT 2009
Dear Numpy Mailing List Readers,
I have a fairly simple problem for which I have not yet found a solution.
I have a gzipped file lying around that has some numbers stored in it and I
want to read them into a numpy array as fast as possible, but only one chunk
of data at a time.
So I would like to use numpy's fromfile function.
For now I have roughly the following code:
import gzip
import numpy as npy
f=gzip.open( "myfile.gz", "r" )
xyz=npy.fromfile(f,dtype="float32",count=400)
So I would read 400 entries from the file, keep it open, process my data,
then come back and read the next 400 entries. If I do this, numpy complains
that the file handle f is not a normal file handle:
IOError: first argument must be an open file
In fact it is a gzip (zlib) file handle. But gzip gives access to the
underlying file handle through f.fileobj.
So I tried xyz=npy.fromfile(f.fileobj,dtype="float32",count=400)
But there I just get meaningless values (not the actual data), and when I
specify the sep=" " argument for npy.fromfile I get just .1 and nothing
else.
Can you tell me why this happens and how to fix it? I know that I could
read everything into memory, but these files are rather big, so I simply
have to avoid that.
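
For illustration, here is a minimal sketch of one possible workaround. It is
not a confirmed fix from this thread. The likely cause of the meaningless
values is that f.fileobj is the underlying compressed stream, so fromfile
reads compressed bytes rather than the stored numbers. The sketch instead
reads a fixed number of decompressed bytes from the gzip handle and converts
them with numpy.frombuffer. It assumes the file holds raw binary float32
values in native byte order; the helper name read_chunks is hypothetical.

```python
import gzip

import numpy as np


def read_chunks(path, dtype="float32", count=400):
    """Yield arrays of up to `count` items decoded from a gzip file.

    Assumes the decompressed content is raw binary data of type `dtype`
    in native byte order (an assumption, not stated in the post).
    """
    itemsize = np.dtype(dtype).itemsize
    nbytes = count * itemsize
    with gzip.open(path, "rb") as f:
        while True:
            # Read decompressed bytes; only this chunk is held in memory.
            buf = f.read(nbytes)
            # Ignore a trailing partial item, if the file has one.
            usable = len(buf) - len(buf) % itemsize
            if usable == 0:
                break
            yield np.frombuffer(buf[:usable], dtype=dtype)
            if len(buf) < nbytes:
                break  # short read means end of file
```

Each yielded array is a read-only view of its chunk, so a loop such as
"for xyz in read_chunks("myfile.gz"): ..." can process 400 entries at a time.
Call xyz.copy() if a writable array is needed.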
Thanks in advance.
--
Peter Schmidtke
----------------------
PhD Student at the Molecular Modeling and Bioinformatics Group
Dep. Physical Chemistry
Faculty of Pharmacy
University of Barcelona