numpy indexing performance

Tue Mar 21 05:46:17 EDT 2017

Olaf Dietrich <odt at dtrx.de>:
> This is a simplified example of a Monte Carlo
> simulation where random vectors (here 2D vectors,
> which are all zero) are summed (the result is in
> r1 and r2 or r, respectively):
> 
> def case1():
>     import numpy as np
>     M = 100000
>     N = 10000
>     r1 = np.zeros(M)
>     r2 = np.zeros(M)
>     s1 = np.zeros(N)
>     s2 = np.zeros(N)
>     for n in range(1000):
>         ind = np.random.random_integers(N, size=M) - 1
>         r1 += s1[ind]
>         r2 += s2[ind]
> 
> def case2():
>     import numpy as np
>     M = 100000
>     N = 10000
>     r = np.zeros((M, 2))
>     s = np.zeros((N, 2))
>     for n in range(1000): 
>         ind = np.random.random_integers(N, size=M) - 1
>         r += s[ind]
> 
> import timeit
> 
> print("case1:", timeit.timeit(
>     "case1()", setup="from __main__ import case1", number=1))
> print("case2:", timeit.timeit(
>     "case2()", setup="from __main__ import case2", number=1))
> 
> 
> Resulting in:
> 
> case1: 2.6224704339983873
> case2: 4.374910838028882

I should add that I tried this with Python 3.4.2 and with
Python 2.7.0 (on linux, x64), NumPy version 1.8.2; the
performance differences were the same with both Python
interpreters.

> Why is case2 significantly slower (almost by a
> factor of 2) than case1? There should be the same number
> of operations (additions) in both cases; the main
> difference is the indexing.
> 
> Is there another (faster) way to avoid the separate
> component arrays r1 and r2? (I also tried
> r = np.zeros(M, dtype='2d'), which was comparable
> to case2.)

Olaf