[Numpy-discussion] Numpy array performance issue

Wed Feb 24 11:28:45 EST 2010

On Wed, Feb 24, 2010 at 10:21, Bruno Santos <bacmsantos at gmail.com> wrote:

>> The idiomatic way of doing this for numpy arrays would be:
>>
>> def test2(arrx):
>>    return (arrx >= 10).sum()
>>
>  Even this version takes more time to run than my original Python version
> with arrays.

Works fine for me, and the advantage over the list version grows as the size increases:

In [1]: N = 100

In [2]: import numpy as np

In [3]: A = np.random.randint(0, 21, N)

In [4]: L = A.tolist()

In [5]: %timeit len([e for e in L if e >= 10])
100000 loops, best of 3: 15 us per loop

In [6]: %timeit (A >= 10).sum()
100000 loops, best of 3: 12.7 us per loop

In [7]: N = 1000

In [8]: %macro mm 3 4 5 6
Macro `mm` created. To execute, type its name (without quotes).
Macro contents:
A = np.random.randint(0, 21, N)
L = A.tolist()
_ip.magic("timeit len([e for e in L if e >= 10])")
_ip.magic("timeit (A >= 10).sum()")

In [9]: mm
------> mm()
10000 loops, best of 3: 103 us per loop
100000 loops, best of 3: 17.6 us per loop
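
For anyone without IPython at hand, roughly the same comparison can be run with the standard timeit module. This is only a sketch: the helper names (count_list, count_array, best_time_us) are made up for illustration and do not come from the session above, and the absolute timings will depend on the machine and the NumPy version.

```python
import timeit

import numpy as np


def count_list(values, threshold=10):
    """Count elements >= threshold with a pure-Python list comprehension."""
    return len([e for e in values if e >= threshold])


def count_array(arr, threshold=10):
    """Count elements >= threshold by summing a boolean mask."""
    return int((arr >= threshold).sum())


def best_time_us(func, arg, number=1000, repeat=3):
    """Best per-call time in microseconds (hypothetical helper, similar in spirit to %timeit)."""
    timings = timeit.repeat(lambda: func(arg), number=number, repeat=repeat)
    return min(timings) / number * 1e6


if __name__ == "__main__":
    for n in (100, 1000):
        arr = np.random.randint(0, 21, n)  # integers in [0, 20], as in the session
        lst = arr.tolist()
        # Both approaches must agree on the count before timing them.
        assert count_list(lst) == count_array(arr)
        print("N=%d  list: %.1f us  array: %.1f us"
              % (n, best_time_us(count_list, lst), best_time_us(count_array, arr)))
```

Note that the list is built once with tolist() outside the timed call, as in the session, so the conversion cost is not counted against either version.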

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco