[SciPy-User] Maximum file size for .npz format?

Gökhan Sever gokhansever at gmail.com
Fri Mar 12 12:29:58 EST 2010


On Fri, Mar 12, 2010 at 11:22 AM, Paul Anton Letnes <
paul.anton.letnes at gmail.com> wrote:

>
> On 11. mars 2010, at 23.50, Lafras Uys wrote:
>
> >>> I need to save a fairly large set of arrays to disk. I have saved it
> >>> using numpy.savez, and the resulting file is around 11 GB (yes, I did
> >>> say fairly large ;D). When I try to load it using numpy.load, the
> >>> zipfile module complains:
> >>> BadZipfile: Bad magic number for file header
> >>>
> >>> I can't open it with the normal zip utility present on the system, but
> >>> it could be that it's barfing about files being larger than 2 GB.
> >>> Is there some file size limit for npz files?
> >>
> >> Yes, the ZIP file format has a 4GB limit. Unfortunately, Python does
> >> not yet support the ZIP64 format.
> >>
> >>> Is there any way I can recover the data? (I
> >>> guess I could try decompressing the file with 7z and extracting the
> >>> individual npy files.)
> >>
> >> Possibly. However, if the normal zip utility isn't working, 7z
> >> probably won't, either. Worth a try, though.
> >
> > I've had similar problems; my solution was to move to HDF5. There are
> > two options for accessing and working with HDF5 files from Python: h5py
> > (http://code.google.com/p/h5py/) and PyTables
> > (http://www.pytables.org/). Both packages have built-in NumPy support.
> >
> > Regards,
> > Lafras
>
> I've experienced similar issues too, but I moved to NetCDF. The only
> disadvantage was that I did not find any Python modules that work well _and_
> support NumPy. Hence, I am considering moving to HDF5. Which Python module
> would people here recommend? (Or, alternatively, did I miss a great NetCDF
> Python module that someone could tell me about?)
>
> Cheers,
> Paul.

There is http://code.google.com/p/netcdf4-python/

I know NetCDF4 is a subset of HDF5. What are the advantages of using HDF5
rather than NetCDF4?
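
On the original question quoted above: the classic ZIP format caps each
entry and the whole archive at 4 GiB, and the ZIP64 extension lifts that
limit. As a rough sketch (not NumPy's own code), Python's standard zipfile
module accepts an allowZip64 flag when writing, and is_zipfile() and
testzip() give a quick check of whether an existing archive is readable at
all. The helper names write_archive and inspect_archive and the file name
demo.npz are made up for illustration; whether numpy.savez itself enables
ZIP64 depends on the NumPy version, so treat that part as an assumption to
verify.

```python
import os
import tempfile
import zipfile


def write_archive(path, members):
    """Write {name: bytes} into a ZIP archive with ZIP64 extensions allowed.

    Hypothetical helper for illustration; it is not part of NumPy.
    """
    # allowZip64=True lets zipfile emit ZIP64 records when an entry or the
    # archive as a whole grows past the classic 4 GiB limit.
    with zipfile.ZipFile(path, "w", zipfile.ZIP_STORED, allowZip64=True) as zf:
        for name, payload in members.items():
            zf.writestr(name, payload)


def inspect_archive(path):
    """Return (is_zip, member_names, first_bad_member) for an archive."""
    # is_zipfile() only checks that a valid end-of-archive record exists.
    if not zipfile.is_zipfile(path):
        return False, [], None
    with zipfile.ZipFile(path) as zf:
        # testzip() reads every member and returns the name of the first
        # one with a bad CRC or header, or None if all of them check out.
        return True, zf.namelist(), zf.testzip()


if __name__ == "__main__":
    tmpdir = tempfile.mkdtemp()
    demo_path = os.path.join(tmpdir, "demo.npz")  # hypothetical file name
    write_archive(demo_path, {"a.npy": b"\x00" * 16, "b.npy": b"\x01" * 16})
    print(inspect_archive(demo_path))
```

A file that was truncated or written without ZIP64 support will most likely
fail the first check, in which case this only confirms the problem; it does
not recover any data.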


-- 
Gökhan