[Numpy-discussion] numpy 10x slower than native Python arrays for simple operations?

Sat Feb 6 19:07:24 EST 2010

On Sat, 2010-02-06 at 16:21 -0500, Joseph Turian wrote:
> I have done some profiling, and the results are completely
> counterintuitive. For simple array access operations, numpy and
> array.array are 10x slower than native Python arrays.
> 
> I am using numpy 1.3.0, the standard Ubuntu 9.03 package.
> 
> Why am I getting such slow access speeds?
> Note that for "array access", I am doing operations of the form:
>     a[i] += 1
> 
> Profiles:
> 
> [0] * 20000000
>    Access: 2.3M / sec
>    Initialization: 0.8s
> 
> numpy.zeros(shape=(20000000,), dtype=numpy.int32)
>    Access: 160K/sec
>    Initialization: 0.2s

The speed difference here comes from the fact that

	a[i] += 1

effectively calls numpy.core.umath.add(a[i], 1, a[i]). Since add is
designed to handle operations on arrays, and at the moment there is no
short-circuit for single numbers (0-d operands), it has a fixed overhead
that is larger than for Python's simple number+number addition.

In vectorized operations the overhead is paid once per call rather than
once per element, so it does not matter, but changing a single element
at a time makes it show.
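
As a rough illustration of the difference, here is a minimal sketch that
times a per-element loop against a single vectorized update. The
function names (bump_elementwise, bump_vectorized) and the array size
are made up for this example, and the absolute timings will depend on
the machine and the NumPy version.

```python
import timeit

import numpy as np


def bump_elementwise(seq):
    """Increment every element one index at a time (list or ndarray)."""
    for k in range(len(seq)):
        seq[k] += 1


def bump_vectorized(arr):
    """Increment every element of an ndarray with one in-place ufunc call."""
    arr += 1


n = 100_000  # kept small so the sketch runs quickly
py_list = [0] * n
np_array = np.zeros(n, dtype=np.int32)

# Each timing is the average over a few full passes through the data.
runs = 3
cases = [
    ("list, element by element", lambda: bump_elementwise(py_list)),
    ("ndarray, element by element", lambda: bump_elementwise(np_array)),
    ("ndarray, vectorized", lambda: bump_vectorized(np_array)),
]
for label, fn in cases:
    seconds = timeit.timeit(fn, number=runs) / runs
    print(f"{label}: {seconds / n * 1e9:.1f} ns per element")
```

The element-by-element ndarray case pays the fixed overhead on every
index, while the vectorized case pays it once for the whole array.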

If `i` is an index vector, Numpy has faster per-element access times:

In [1]: import numpy as np

In [2]: a = np.zeros((2000000,), 'i4')

In [3]: b = [0] * 2000000

In [5]: i = np.arange(0, 2000000, 5)

In [8]: %timeit b[0] += 1
1000000 loops, best of 3: 260 ns per loop

In [20]: %timeit a[i] += 1
10 loops, best of 3: 71.2 ms per loop

In [25]: 71.2e-3/len(i)
Out[25]: 1.7800000000000001e-07

i.e., 178 ns per element
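
The same measurement can be written as a standalone script. This is only
a sketch: the helper name bump_indexed and the number of timing runs are
chosen for this example, and the figure it prints will differ from the
session above.

```python
import timeit

import numpy as np

a = np.zeros(2_000_000, dtype=np.int32)
i = np.arange(0, 2_000_000, 5)  # every fifth index, all unique


def bump_indexed(arr, idx):
    """Increment arr at the positions in idx with one indexed update.

    Assumes idx has no repeated entries; with repeats, an index that
    appears several times is still incremented only once.
    """
    arr[idx] += 1


runs = 10
total = timeit.timeit(lambda: bump_indexed(a, i), number=runs)

# Divide the time of one indexed update by the number of elements touched.
per_element = total / runs / len(i)
print(f"{per_element * 1e9:.1f} ns per element")
```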

-- 
Pauli Virtanen