[Numpy-discussion] faster code

Keith Goodman kwgoodman at gmail.com
Sun May 16 18:29:21 EDT 2010


On Sun, May 16, 2010 at 1:18 PM, Eric Firing <efiring at hawaii.edu> wrote:
> On 05/16/2010 09:24 AM, Keith Goodman wrote:
>> On Sun, May 16, 2010 at 12:14 PM, Davide Lasagna
>> <lasagnadavide at gmail.com>  wrote:
>>> Hi all,
>>> What is the fastest and lowest memory consumption way to compute this?
>>> y = np.arange(2**24)
>>> bases = y[1:] + y[:-1]
>>> Actually it is already quite fast, but I'm not sure whether the summation
>>> is occupying some temporary memory. Any help is appreciated.
>>
>> Is it OK to modify y? If so:
>>
>>>> y = np.arange(2**24)
>>>> z = y[1:] + y[:-1]  #<--- Slow way
>>>> y[:-1] += y[1:]  #<--- Fast way
>>>> (y[:-1] == z).all()
>>     True
>
>
> It's not faster on my machine, as timed with IPython:
>
> In [8]:y = np.arange(2**24)
>
> In [9]:b = np.array([1,1], dtype=int)
>
> In [10]:timeit np.convolve(y, b, 'valid')
> 1 loops, best of 3: 484 ms per loop
>
> In [11]:timeit y[1:] + y[:-1]
> 10 loops, best of 3: 181 ms per loop
>
> In [12]:timeit y[:-1] += y[1:]
> 10 loops, best of 3: 183 ms per loop
>
> If we include the fake data generation in the timing, to reduce cache
> bias in the repeated runs, the += method is noticeably slower.
>
> In [13]:timeit y = np.arange(2**24); z = y[1:] + y[:-1]
> 1 loops, best of 3: 297 ms per loop
>
> In [14]:timeit y = np.arange(2**24); y[:-1] += y[1:]; z = y[:-1]
> 1 loops, best of 3: 322 ms per loop

That's interesting. On my computer it is faster:

>> timeit y = np.arange(2**24); z = y[1:] + y[:-1]
10 loops, best of 3: 144 ms per loop
>> timeit y = np.arange(2**24); y[:-1] += y[1:]; z = y[:-1]
10 loops, best of 3: 114 ms per loop
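For anyone who wants to reproduce the comparison, the two approaches can be put side by side in a self-contained script (a sketch; the array is smaller than the thread's 2**24 so it runs quickly, and absolute times will differ by machine):

```python
import numpy as np
from timeit import timeit

n = 2**20  # smaller than the thread's 2**24 so the sketch runs quickly

def slow():
    y = np.arange(n)
    return y[1:] + y[:-1]   # allocates a fresh array for the result

def fast():
    y = np.arange(n)
    y[:-1] += y[1:]         # writes the pairwise sums back into y's buffer
    return y[:-1]

# Both forms compute the same pairwise sums.
assert (slow() == fast()).all()

print("slow:", timeit(slow, number=10))
print("fast:", timeit(fast, number=10))
```

Including the `np.arange` call inside each timed function mirrors Eric's point about reducing cache bias from repeated runs on the same buffer.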

What accounts for the performance difference? Cache size?

I assume the in-place version uses less memory. It would be neat if
timeit reported memory usage. I haven't tried numexpr; that might be
something to try too.
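If y must be left unmodified, the same "no temporary result" behavior is available by preallocating the output buffer yourself and passing it to `np.add` via `out=` (a minimal sketch, not from the thread):

```python
import numpy as np

y = np.arange(2**20)
out = np.empty(y.size - 1, dtype=y.dtype)  # preallocated result buffer

# np.add with out= writes the pairwise sums straight into `out`,
# so no temporary result array is allocated and y is left untouched.
np.add(y[1:], y[:-1], out=out)

assert (out == y[1:] + y[:-1]).all()
```

This keeps the original array intact, which the `+=` trick does not, at the cost of one explicit allocation you control.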
