[Numpy-discussion] NumPy-Discussion Digest, Vol 38, Issue 52

Mon Nov 16 12:19:07 EST 2009

Jake VanderPlas wrote:
> It sounds like all of this could be done very simply without going to
> C, using a class based on numpy.ndarray.  The following works for 1D
> arrays, behaves like a regular 1D numpy array, and could be easily
> improved with a little care.  Is this what you had in mind?
> 
> import numpy
> 
> #easy scalable array class
> class scarray:
>     def __init__(self,*args,**kwargs):
>         self.__data = numpy.ndarray(*args,**kwargs)
> 
>     def append(self,val):
>         tmp = self.__data
>         self.__data = numpy.ndarray(tmp.size+1)
>         self.__data[:-1] = tmp
>         self.__data[-1] = val
>         del tmp
> 
>     def __getattr__(self,attr):
>         return getattr(self.__data,attr)

The problem here is that it's re-allocating memory with every single 
addition of an element. It's pretty common to pre-allocate some extra 
space for this kind of thing (python lists, std vectors, etc, etc, etc), 
so I assumed that it would be performance killer. However, you know what 
they say about premature optimization, so a test or two is in order. 
This is incrementally adding 10000 integers, one point at a time:

Using the suggested code:

In [21]: timeit p.scarray1(10000)
10 loops, best of 3: 1.71 s per loop

Using my accumulator code:

In [23]: timeit p.accum1(10000)
10 loops, best of 3: 25.6 ms per loop

So all that memory re-allocation really does kill performance.

In [24]: timeit p.list1(10000)
100 loops, best of 3: 9.96 ms per loop

But, of course, lists are still faster. I think this is because I'm 
adding python integers, which are already python objects, so that's 
exactly what lists are for -- you can't beat them. This wouldn't apply 
to using them from C, however.

Also, I see a change when we add chunks of data already in a numpy 
array, with .extend():

# adding a sequence of ten integers at a time:
In [40]: timeit profile_accumulator.accum_extend1(10000)
100 loops, best of 3: 6.36 ms per loop

In [41]: timeit profile_accumulator.accum_extend1(10000)
100 loops, best of 3: 6.22 ms per loop

# about the same speed

# but when I add 100 elements at a time:

In [46]: timeit profile_accumulator.list_extend1(10000)
10 loops, best of 3: 56.6 ms per loop

In [47]: timeit profile_accumulator.accum_extend1(10000)
100 loops, best of 3: 13.3 ms per loop

# I start to see a real advantage to the numpy accumulator approach. Is
# that a real use-case? I'm not sure.

-Chris

-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov