[SciPy-user] Reading in data as arrays, quickly and easily?

Francesc Alted falted at pytables.org
Mon Jul 12 08:57:35 EDT 2004


On Saturday 10 July 2004 19:29, Eric Jonas wrote:
> Well, I had been focusing on numarray, because everything I read seems
> to suggest that it's the wave of the future, although at the same time
> no one really seems to be using it much yet. May I ask how much larger
> than 1 GB?  I'm dealing with between 1-20 GB EEG files, and for some
> reason I don't think I'll be able to afford 64-bit hardware in the near
> future : ) 

If you need the full 64-bit file address space (even on 32-bit
processors), you may want to look at PyTables [1]. It handles both
homogeneous and heterogeneous datasets. It also takes care of byte
ordering automatically, so you can write data on big-endian machines and
read it on little-endian ones without problems, which seems to be what
you need.
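
To make the byte-ordering point concrete, here is a minimal sketch that does
not use PyTables at all, only Python's standard struct module. The record
layout (a 32-bit counter plus four 16-bit channel readings) is purely an
assumption for illustration, not your actual EEG format. The idea is the
same one PyTables relies on: the byte order is stated explicitly with the
data, so the reader gets the right values whatever the host machine is.

```python
import struct
import sys

# Hypothetical record layout (an assumption, not a real EEG format):
# one unsigned 32-bit sample counter followed by four signed 16-bit
# channel readings. The leading ">" means big-endian, no padding.
RECORD = struct.Struct(">I4h")


def pack_record(counter, channels):
    """Serialize one record as big-endian bytes."""
    return RECORD.pack(counter, *channels)


def unpack_record(raw):
    """Decode one big-endian record into native Python integers."""
    counter, *channels = RECORD.unpack(raw)
    return counter, channels


raw = pack_record(7, [100, -200, 300, -400])
# The decoded values are the same on big- and little-endian hosts.
print(sys.byteorder, unpack_record(raw))
```

Because the format string names the byte order, no manual swapping is needed
on the reading side.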

> What I really want is to read in some fairly complex records, do endian
> swapping, alignment, etc. all in C. I'm mostly interested in spectral
> analysis, so the hope was that I'd be able to read in 32kB chunks at a
> time for my periodograms. 

If the data is generated by programs written in C, you can still save it in
HDF5 format [2] (the format used by PyTables) through its C API [3] and
read it afterwards with PyTables. There is even a high-level API [4] that
makes creating HDF5 files from C easier (while staying compatible with
PyTables).

> Also, I looked through the numarray docs again, and still couldn't find
> anything about memory mapping -- any pointers? What command(s) have you
> been using to pull this off? 

Memory mapping features in numarray are documented mainly as docstrings in
the sources (see numarray/memmap.py). You may also find this thread [5]
interesting.
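
As a rough illustration of the memory-mapping idea itself, here is a sketch
that uses Python's standard mmap module rather than numarray's memmap (whose
interface I am not reproducing here). It maps the file one window at a time
instead of all at once, since a 32-bit process cannot map a multi-GB file
whole, and yields 32 kB chunks from each window. The function name
iter_chunks and the chunks_per_window parameter are made up for this
example.

```python
import mmap
import os

CHUNK_BYTES = 32 * 1024  # 32 kB chunks, as in the question


def iter_chunks(path, chunk_bytes=CHUNK_BYTES, chunks_per_window=64):
    """Yield successive chunks of a file as bytes, mapping one window at a time."""
    window = chunk_bytes * chunks_per_window
    # Offsets passed to mmap must be multiples of the allocation granularity.
    if window % mmap.ALLOCATIONGRANULARITY != 0:
        raise ValueError("window size must be a multiple of "
                         "mmap.ALLOCATIONGRANULARITY")
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        for offset in range(0, size, window):
            length = min(window, size - offset)
            # Map only this window read-only; it is unmapped when the
            # with-block exits, before the next window is mapped.
            with mmap.mmap(f.fileno(), length,
                           access=mmap.ACCESS_READ, offset=offset) as mm:
                for start in range(0, length, chunk_bytes):
                    yield mm[start:start + chunk_bytes]
```

Each yielded chunk is a plain bytes object (the last one may be shorter), so
it can be handed to whatever decoding or spectral code comes next.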

[1] http://www.pytables.org
[2] http://hdf.ncsa.uiuc.edu/HDF5/
[3] http://hdf.ncsa.uiuc.edu/HDF5/doc/UG/
[4] http://hdf.ncsa.uiuc.edu/HDF5/hdf5_hl/
[5] http://aspn.activestate.com/ASPN/Mail/Message/numpy-discussion/1895405

Cheers,

-- 
Francesc Alted
