numpy indexing performance

Olaf Dietrich odt at dtrx.de
Wed Mar 15 10:19:46 EDT 2017


This is a simplified example of a Monte Carlo
simulation in which random 2-D vectors (all zero
here, so only the indexing cost matters) are
summed; the result ends up in r1 and r2, or in r,
respectively:

def case1():
    import numpy as np
    M = 100000          # number of accumulators
    N = 10000           # number of source values per component
    r1 = np.zeros(M)    # separate accumulator per vector component
    r2 = np.zeros(M)
    s1 = np.zeros(N)    # separate source array per vector component
    s2 = np.zeros(N)
    for n in range(1000):
        # random_integers is deprecated (removed in newer NumPy);
        # randint(N) yields the same range 0..N-1 directly
        ind = np.random.randint(N, size=M)
        r1 += s1[ind]
        r2 += s2[ind]

def case2():
    import numpy as np
    M = 100000
    N = 10000
    r = np.zeros((M, 2))   # one (M, 2) array instead of two 1-D arrays
    s = np.zeros((N, 2))
    for n in range(1000):
        # same replacement for the deprecated random_integers call
        ind = np.random.randint(N, size=M)
        r += s[ind]

import timeit

print("case1:", timeit.timeit(
    "case1()", setup="from __main__ import case1", number=1))
print("case2:", timeit.timeit(
    "case2()", setup="from __main__ import case2", number=1))


Resulting in:

case1: 2.6224704339983873
case2: 4.374910838028882


Why is case2 significantly slower (almost by a
factor of 2) than case1? Both cases should perform
the same number of operations (additions); the main
difference is the indexing.
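One thing that can be verified directly (a small sketch, not an
explanation of the full timing difference): in case2 the expression
s[ind] is advanced (fancy) indexing on a 2-D array, so each lookup
gathers a whole 2-element row, and the result is always a freshly
allocated (M, 2) array, never a view:

```python
import numpy as np

N, M = 10_000, 100_000
s = np.zeros((N, 2))                  # source vectors, one 2-D vector per row
ind = np.random.randint(N, size=M)    # indices 0..N-1

gathered = s[ind]                     # advanced indexing gathers whole rows
print(gathered.shape)                 # (100000, 2): M two-element rows

# Advanced indexing always allocates a new array, never a view:
gathered[0, 0] = 1.0
print(s[ind[0], 0])                   # still 0.0 -- s is untouched
```

So every pass through the loop in case2 materialises a temporary
(M, 2) array before the += runs, just as case1 materialises two
temporary length-M arrays.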

Is there another (faster) way to avoid the separate
component arrays r1 and r2? (I also tried
r = np.zeros(M, dtype='2d'), which was comparable
to case2.)
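One more layout that might be worth timing (an untested suggestion,
not a measured result): transpose the arrays so that each component
occupies one contiguous row, which matches the memory layout of the
separate r1/r2 arrays in case1 while still keeping a single array:

```python
import numpy as np

M, N = 100_000, 10_000

# Transposed layout: shape (2, M) / (2, N), one contiguous row per
# component.  Whether this beats the (M, 2) layout of case2 depends
# on cache behaviour and NumPy version, so time it on your machine.
r = np.zeros((2, M))
s = np.zeros((2, N))

ind = np.random.randint(N, size=M)   # indices 0..N-1
r += s[:, ind]                       # gather M columns, accumulate in one step

print(r.shape)                       # (2, 100000)
```

The individual components are then available as r[0] and r[1],
which are views, so nothing needs to be copied to get at them.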

Olaf


