[Numpy-discussion] fast numpy i/o

Robert Kern robert.kern at gmail.com
Mon Jun 27 12:36:53 EDT 2011


On Mon, Jun 27, 2011 at 11:17, Derek Homeier
<derek at astro.physik.uni-goettingen.de> wrote:
> On 21.06.2011, at 8:35PM, Christopher Barker wrote:
>
>> Robert Kern wrote:
>>> https://raw.github.com/numpy/numpy/master/doc/neps/npy-format.txt
>>
>> Just a note. From that doc:
>>
>> """
>>     HDF5 is a complicated format that more or less implements
>>     a hierarchical filesystem-in-a-file.  This fact makes satisfying
>>     some of the Requirements difficult.  To the author's knowledge, as
>>     of this writing, there is no application or library that reads or
>>     writes even a subset of HDF5 files that does not use the canonical
>>     libhdf5 implementation.
>> """
>>
>> I'm pretty sure that the NetcdfJava libs, developed by Unidata, use
>> their own home-grown code. netcdf4 is built on HDF5, so that qualifies
>> as "a library that reads or writes a subset of HDF5 files". Perhaps
>> there are lessons to be learned there. (too bad it's Java)

> Some late comments on the note: I was a bit surprised that HDF5 installation seems to be a serious hurdle for many (maybe I've just been profiting from the fink build system for OS X here), and I was also not aware that the current netCDF is built for backward compatibility with the HDF5 standard, something useful learnt again... :-)

It's not so much that it's hard to build for lots of people. Rather,
it would be quite difficult to include in numpy itself, particularly
if we are just relying on distutils. numpy is too "fundamental" of a
package to have extra dependencies.
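
For comparison, a minimal sketch of the dependency-free route: the .npy
format described in the NEP linked above can be written and read with
nothing but numpy. The array contents and the in-memory buffer are
illustrative assumptions, not anything from this thread.

```python
import io

import numpy as np

# Illustrative data; any array with a simple dtype would do.
data = np.arange(12, dtype=np.float64).reshape(3, 4)

# np.save accepts a file-like object, so an in-memory buffer is enough here.
buf = io.BytesIO()
np.save(buf, data)

# Every .npy file starts with the 6-byte magic string b"\x93NUMPY".
buf.seek(0)
magic = buf.read(6)

# Rewind and load the array back; shape and dtype come from the header.
buf.seek(0)
restored = np.load(buf)
```

The same calls work with a filename in place of the buffer.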

> Some more confusion arose when I found that the Unidata netCDF distribution includes C and Fortran versions:
> http://www.unidata.ucar.edu/software/netcdf/
> but they actually also depend on HDF5 for netCDF 4 access. While the Java version appears not to, it also provides only *read* access to those formats, so it probably would not be of much help anyway.

Also good to know!

> The netCDF4-Python package mentioned before
> http://code.google.com/p/netcdf4-python/
> unfortunately builds on HDF5 again, same for the PyNIO module
> http://www.pyngl.ucar.edu/Nio.shtml
> which is probably explained by the above dependencies.
>
> Finally, the former Scientific.IO NetCDF interface is now part of scipy.io, but I assume it only supports netCDF 3 (the documentation is not specific about that). This might be the easiest option for a portable data format (if Matlab supports it).

Yes, it is NetCDF 3.
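
A rough sketch of a round trip through that interface, assuming a scipy
version that exposes it as scipy.io.netcdf_file. The file name, dimension
name, and values are made up for illustration.

```python
import os
import tempfile

import numpy as np
from scipy.io import netcdf_file

# Hypothetical output location, used only for this example.
path = os.path.join(tempfile.mkdtemp(), "example.nc")

# Write a NetCDF 3 file with one dimension and one variable.
f = netcdf_file(path, "w")
f.createDimension("x", 5)
var = f.createVariable("x", "d", ("x",))
var[:] = np.linspace(0.0, 1.0, 5)
var.units = "m"
f.close()

# Read it back. mmap=False plus copy() keeps the data valid after close().
f = netcdf_file(path, "r", mmap=False)
values = f.variables["x"][:].copy()
f.close()
```

No HDF5 library is involved in either step.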

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco


