[SciPy-User] Maximum file size for .npz format?

Paul Anton Letnes paul.anton.letnes at gmail.com
Fri Mar 12 12:22:55 EST 2010


On 11. mars 2010, at 23.50, Lafras Uys wrote:

>>> I need to save a fairly large set of arrays to disk. I have saved it using
>>> numpy.savez, and the resulting file is around 11 GB (yes, I did say fairly
>>> large ;D). When I try to load it using numpy.load, the zipfile module
>>> complains about
>>> BadZipfile: Bad magic number for file header
>>> 
>>> I can't open it with the normal zip utility present on the system, but it
>>> could be that it's barfing about files being larger than 2 GB.
>>> Is there some file size limit for .npz files?
>> 
>> Yes, the ZIP file format has a 4GB limit. Unfortunately, Python does
>> not yet support the ZIP64 format.
>> 
>>> Is there any way I can recover the data? (I
>>> guess I could try decompressing the file with 7z and extracting the
>>> individual .npy files.)
>> 
>> Possibly. However, if the normal zip utility isn't working, 7z
>> probably won't, either. Worth a try, though.
> 
> I've had similar problems; my solution was to move to HDF5. There are
> two options for accessing and working with HDF5 files from Python: h5py
> (http://code.google.com/p/h5py/) and PyTables
> (http://www.pytables.org/). Both packages have built-in NumPy support.
> 
> Regards,
> Lafras

I've experienced similar issues too, but I moved to NetCDF. The only disadvantage was that I did not find any Python modules that both work well _and_ support NumPy. Hence, I am considering moving to HDF5. Which Python module would people here recommend? (Or, alternatively, did I miss a great NetCDF Python module that someone could tell me about?)
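
On the recovery question quoted above: an .npz file is an ordinary zip archive with one .npy file per array, so it may be worth trying to pull the members out one at a time instead of loading everything at once. Below is a minimal sketch using only the standard zipfile module (modern Python 3 syntax). The function name salvage_npz and the output directory are made up for illustration, and I have not tried this on a file of that size. It only helps if the archive's directory is still readable; if opening the archive itself fails, the error is simply raised.

```python
import os
import zipfile
import zlib


def salvage_npz(npz_path, out_dir):
    """Try to extract each member of an .npz (zip) archive on its own.

    Hypothetical helper: returns (recovered, failed), two lists of
    member names. A failed member may leave a partial file in out_dir.
    """
    os.makedirs(out_dir, exist_ok=True)
    recovered, failed = [], []
    with zipfile.ZipFile(npz_path) as archive:
        for name in archive.namelist():
            try:
                # Extract one member at a time so that a single bad
                # header does not abort the whole recovery attempt.
                archive.extract(name, out_dir)
                recovered.append(name)
            except (zipfile.BadZipFile, zlib.error, OSError, EOFError) as exc:
                print("could not extract %s: %s" % (name, exc))
                failed.append(name)
    return recovered, failed
```

Something like recovered, failed = salvage_npz("big.npz", "recovered") (file names are placeholders) would then leave whatever is readable as individual .npy files, which numpy.load can open one by one.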

Cheers,
Paul.

