[Numpy-discussion] Is there a pure numpy recipe for this?

Aaron O'Leary aaron.oleary at gmail.com
Thu Mar 27 12:19:54 EDT 2014


You might want to look at HDF5 if you're routinely running out of RAM.
I'm using h5py with multi-gigabyte data on an SSD right now. It is very
fast. You still have to be careful with your computations and try to
avoid creating copies, though.

h5py: www.h5py.org
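
A minimal sketch of the block-wise pattern described above, assuming h5py
and numpy are installed. The file name, dataset name, sizes and the random
data are all made up for illustration; only one block is held in memory at
a time.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical file and sizes, chosen only for illustration.
path = os.path.join(tempfile.mkdtemp(), "signal.h5")
n_samples, block = 1_000_000, 100_000
rng = np.random.default_rng(0)

# Write the data one block at a time, so the full array never sits in RAM.
with h5py.File(path, "w") as f:
    dset = f.create_dataset("signal", shape=(n_samples,), dtype="f4",
                            chunks=(block,))
    for start in range(0, n_samples, block):
        dset[start:start + block] = rng.random(block, dtype=np.float32)

# Read it back block by block; slicing a dataset loads only that slice.
total = 0.0
with h5py.File(path, "r") as f:
    dset = f["signal"]
    for start in range(0, n_samples, block):
        chunk = dset[start:start + block]      # a plain in-memory ndarray
        total += float(chunk.sum(dtype="f8"))  # accumulate in double precision

mean = total / n_samples
```

Each slice read from the dataset is a new array, so the "avoid copies"
caveat still applies to whatever you do with the block afterwards.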

aaron

On Thu 27 Mar, RayS wrote:
> I find this interesting, since I work with medical data sets of 100s 
> of MB, and regularly run into memory allocation problems when doing a 
> lot of Fourier analysis, waterfalls etc. The per-process limit seems 
> to be about 1.3GB on this 6GB quad-i7 with Win7. For live data 
> collection routines I simply create zeros() of say 300MB and trim 
> the array when saving to disk. memmaps are also limited to RAM, and 
> take a looooong time to create (seconds). So, I've been investigating 
> Pandas and segmentaxis - just a bit so far.
> 
> - Ray Schumacher
> 
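
The preallocate-and-trim approach Ray describes might look roughly like
this. It is only a sketch: acquire_block() is a hypothetical stand-in for
the real data source, and the buffer size is arbitrary.

```python
import numpy as np

capacity = 1_000_000                        # arbitrary upper bound on samples
buf = np.zeros(capacity, dtype=np.float32)  # allocated once, up front
n = 0                                       # number of samples filled so far


def acquire_block(size):
    """Hypothetical stand-in for one read from the acquisition hardware."""
    return np.ones(size, dtype=np.float32)


for _ in range(7):
    block = acquire_block(1000)
    buf[n:n + len(block)] = block  # copy into the preallocated buffer
    n += len(block)

recorded = buf[:n]  # trimming by slicing gives a view, not a copy
# np.save("recording.npy", recorded) would then write only the filled part.
```

Because the trimmed array is a view, no second large allocation is needed
before saving.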
> 
> At 12:02 AM 3/27/2014, you wrote:
> >Chris Barker - NOAA Federal wrote
> > > note that numpy arrays are not re-sizable, so np.append() and np.insert()
> > > have to make a new array, and copy all the old data over. If you are
> > > appending one at a time, this can be pretty darn slow.
> > >
> > > I wrote a "grow_array" class once, it was a wrapper around a numpy array
> > > that pre-allocated extra data to make appending more efficient. It's kind
> > > of half-baked code now, but let me know if you are interested.
> >
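
For reference, the general idea of a growable wrapper can be sketched as
below. This is not Chris's grow_array code, which is not shown in the
thread; the class name, the doubling policy and the initial capacity are
all assumptions made for illustration.

```python
import numpy as np


class GrowArray:
    """Illustrative 1-D appendable array with spare capacity (hypothetical)."""

    def __init__(self, dtype=np.float64, capacity=16):
        self._data = np.empty(capacity, dtype=dtype)
        self._size = 0

    def append(self, value):
        if self._size == len(self._data):
            # Buffer is full: double it and copy the old data over once.
            bigger = np.empty(2 * len(self._data), dtype=self._data.dtype)
            bigger[:self._size] = self._data
            self._data = bigger
        self._data[self._size] = value
        self._size += 1

    @property
    def values(self):
        # View of the filled part only; the spare capacity stays hidden.
        return self._data[:self._size]


acc = GrowArray(capacity=4)
for i in range(10):
    acc.append(i * 0.5)
```

With doubling, the copy happens only when the buffer fills up, so the
amortized cost per append stays constant, instead of copying the whole
array on every np.append() call.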
> >Hi Chris,
> >
> >Yes, it is a good point and I am aware of it. For some of these functions it
> >would have been nice if I could have passed a preallocated, properly sliced
> >array to the functions, which I could then reuse in each iteration step.
> >
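
Where a function does accept an output array, the reuse Slaunger asks for
is possible. NumPy ufuncs take an out= argument, for example. The sketch
below uses made-up data and a made-up computation; functions without an
out= parameter still allocate their result.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 5)
scratch = np.empty_like(x)  # allocated once, reused in every iteration

results = []
for k in range(1, 4):
    np.multiply(x, k, out=scratch)  # writes k*x into scratch, no new array
    np.sin(scratch, out=scratch)    # in-place: scratch now holds sin(k*x)
    results.append(float(scratch.sum()))
```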
> >It is indeed the memory allocation which appears to take more time than the
> >actual calculations.
> >
> >Still it is much faster to create a few arrays than to loop through a
> >thousand individual elements in pure Python.
> >
> >The grow_array class sounds interesting. I think that what I have for now is
> >sufficient, but I will keep your offer in mind:)
> >
> >--Slaunger
> 
> 
> 
> 
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion


