Why is indexing into an numpy array that slow?

Mon Nov 10 19:37:10 EST 2008

Robert Kern wrote:
> Rüdiger Werner wrote:
>> Hello!
> 
> Hi!
> 
> numpy questions are best answered on the numpy mailing list.
> 
>   http://www.scipy.org/Mailing_Lists
> 
>> Out of curiosity and to learn a little bit about the numpy package 
>> i've tryed to implement
>> a vectorised version of the 'Sieve of Zakiya'.
>>
>> While the code itself works fine it is astounding for me that the 
>> numpy Version is almost 7 times slower than
>> the pure python version. I tryed to find out if i am doing something 
>> wrong but wasn't able to find any answer.
> 
> The result of indexing into a numpy array is a numpy scalar object. We 
> do this instead of returning a float or an int because numpy supports 
> many more data types than just a C double or long, respectively. If I 
> have a uint16 array, indexing into it gives me a uint16 numpy scalar. 
> These are a little more complicated to set up than a regular Python 
> float or int, so they take more time to create.

Taking a closer look, that's only part of the problem. The larger problem is 
that the numpy code wasn't vectorized enough to actually make use of numpy's 
capabilities. You were paying for the overhead numpy imposes without coding in 
such a way to make use of numpy's efficiencies. Here is my version that is more 
than two times faster than the others. I didn't know exactly which details were 
critical to the algorithm and which weren't (e.g. lndx+15), so there's some 
gorpiness that can probably be removed. If you don't actually need to convert 
back to a list, then you can be about 3 times faster than the others.

from numpy import arange, array, newaxis, zeros
def VSoZP5(val):
     """ Vectorised Version of Sieve of Zakia"""
     if val < 3 : return [2]
     if val < 5 : return [2,3]
     lndx = (val-1)/2
     sieve = zeros(lndx+15,dtype="int32")
     sieve[0:14] =  0 , 1 , 2, 0, 4, 5, 0, 7, 8, 0, 10, 0, 0, 13
     _i = arange(14, lndx, 15)
     if lndx > 14:
         sieve[14:-14:15] = _i - 1
         sieve[17:-11:15] = _i + 2
         sieve[19:-9:15] = _i + 4
         sieve[20:-8:15] = _i + 5
         sieve[22:-6:15] = _i + 7
         sieve[23:-5:15] = _i + 8
         sieve[25:-3:15] = _i + 10
         sieve[28::15] = _i + 13
     om = array([9, 15, 18, 24, 27, 33, 42, 45], dtype="int32")
     bm = array([7, 11, 13, 17, 19, 23, 29, 31], dtype="int32")
     for i in xrange(2, (int(sqrt(val))-3/2)+1):
         if not sieve[i]:  continue
         k = 30*i+45
         rm = (bm * i) + om
         rm2 = (rm + k * arange(0, lndx/k + 1)[:,newaxis]).flat
         rm3 = rm2[rm2 < len(sieve)]
         sieve[rm3] = 0
     sieve = sieve[2:lndx]
     primes = (2*arange(2, lndx) + 3)[sieve > 0].tolist()
     primes[0:0] = [2,3,5]
     return primes

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
  that is made terrible by our own mad attempt to interpret it as though it had
  an underlying truth."
   -- Umberto Eco