[Numpy-discussion] Assignment from a list is slow in Numarray

Timo Korvola tkorvola at e.math.helsinki.fi
Mon Sep 20 06:17:01 EDT 2004


Francesc Alted <falted at pytables.org> writes:
> At the beginning, you will need to export your data to a PyTables
> file,

... which appears to be actually an HDF5 file.  Thanks for the tip.  It
is clear that a binary file format would be advantageous,
simply because text files are not seekable in the way needed for
parallel reading: the byte offset of a given record cannot be computed
without scanning the lines before it.  I was thinking of using NetCDF
because OpenDX does not support HDF5.  Konrad Hinsen has written a
Python interface for reading NetCDF files.  Distributed writing is more
complicated, and unfortunately this interface seems particularly
unsuitable for it because the difference between definition and data
mode is hidden.  The interface also uses Numeric instead of Numarray.
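
To illustrate the seekability point, here is a minimal sketch, assuming
a hypothetical raw file of fixed-size float64 records (this is not the
NetCDF or HDF5 layout, and the names chunk_bounds and read_chunk are
made up for the example).  With fixed-size records each reader can
compute its own byte offset and seek straight to it, which a text file
does not allow:

```python
import os
import struct
import tempfile

# Assumed layout: one little-endian float64 per record, no header.
ITEM = struct.Struct("<d")


def chunk_bounds(n_items, n_readers, rank):
    """Return the half-open [start, stop) item range owned by reader `rank`."""
    base, extra = divmod(n_items, n_readers)
    # The first `extra` readers each take one additional item.
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def read_chunk(path, n_readers, rank):
    """Seek directly to this reader's slice and decode only those records."""
    n_items = os.path.getsize(path) // ITEM.size
    start, stop = chunk_bounds(n_items, n_readers, rank)
    with open(path, "rb") as f:
        f.seek(start * ITEM.size)
        raw = f.read((stop - start) * ITEM.size)
    return [value for (value,) in ITEM.iter_unpack(raw)]


def demo():
    """Write ten values to a temporary file and read them back in three chunks."""
    values = [float(i) for i in range(10)]
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"".join(ITEM.pack(v) for v in values))
        path = f.name
    try:
        return [read_chunk(path, 3, rank) for rank in range(3)]
    finally:
        os.remove(path)
```

In a real parallel job each process would call read_chunk with its own
rank; the readers never need to coordinate because the offsets follow
from the file size alone.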

An advantage of HDF5 would be that the libraries support parallel I/O
via MPI-IO, but can this be utilised in PyTables?  There is the problem
that there are no standard MPI bindings for Python.

I have also considered writing Python bindings for Parallel-NetCDF but
I suppose that would not be totally trivial even if the library turns
out to be well Swiggable.

-- 
	Timo Korvola		<URL:http://www.iki.fi/tkorvola>



