[SciPy-user] Very slow comparison of arrays of integers

Robert Kern rkern at ucsd.edu
Mon Jul 18 04:15:47 EDT 2005


Brian Granger wrote:
> Hello all,
> 
> I have some code that is using scipy/numeric and one bottleneck in the 
> code consists of comparing arrays or lists of integers. To my dismay, 
> I am finding that using Python lists is 15-20 times _faster_ than using 
> Numeric arrays for this. The problem is that there seems to be 
> no efficient way to compare arrays of integers.
> 
> Here is code that clearly demonstrates this problem:
> 
> from scipy import *
> 
> def test_list(n):
>     a = range(100)
>     for i in range(n):
>         r = (a == a)
> 
> def test_array(n):
>     a = array(range(100),Int)
>     for i in range(n):
>         r = allclose(a,a)
> 
> The test_list code runs about 20 times as fast as the test_array code 
> that uses allclose().
> 
> Is there any way of comparing two arrays of integers that would be as 
> fast or faster than using lists? Any hints would be greatly appreciated.

allclose() is for floating point arrays. Use alltrue(x == y) for integer 
arrays.
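
For readers on current NumPy rather than the 2005 scipy/Numeric namespace, the same exact comparison can be sketched roughly as below. This assumes NumPy is installed and imported as np; the helper name ints_equal is made up for illustration and is not part of any library. (x == y).all() plays the role of alltrue(x == y), and the shape check is an added guard so that broadcasting cannot make arrays of different shapes look equal.

```python
import numpy as np


def ints_equal(x, y):
    """Exact elementwise equality for integer arrays (hypothetical helper)."""
    x = np.asarray(x)
    y = np.asarray(y)
    # Arrays of different shape are treated as unequal instead of broadcast.
    if x.shape != y.shape:
        return False
    # Elementwise comparison, then reduce to a single Python bool.
    return bool((x == y).all())


a = np.arange(1000, 2000)
b = a.copy()          # equal values, separate array
c = a.copy()
c[-1] += 1            # differs in the last element only

print(ints_equal(a, b))
print(ints_equal(a, c))
```

np.array_equal(a, b) performs a similar shape-aware check and may be the simpler choice when no custom behaviour is needed.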

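Also, benchmark with integers larger than 100. Small integer objects are 
cached and object identity is checked first, I believe, so a list built 
from range(100) may make the list comparison look faster than it really is.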
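
The identity shortcut mentioned above can be observed directly, without any timing. The sketch below uses NaN only as a probe: NaN never compares equal to itself by value, so a True result from a list comparison can only come from the identity check on the elements. The variable names are illustrative.

```python
nan = float("nan")

# By value, NaN is never equal to itself.
value_equal = (nan == nan)

# Both lists hold the *same* object, so the element comparison
# succeeds on identity before == is ever consulted.
same_object_in_lists = ([nan] == [nan])

# Two separately created NaN objects: identity fails, then == fails.
distinct_objects = ([float("nan")] == [float("nan")])

print(value_equal, same_object_in_lists, distinct_objects)
```

The same shortcut applies to every element when a list is compared with itself, as in a == a, which is one reason such a benchmark can flatter the list version.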

In [28]: tlist = timeit.Timer("a == a", setup="a = range(1000, 2000)")
In [29]: tarray = timeit.Timer("alltrue(a == a)", setup="from scipy import alltrue,arange; a = arange(1000, 2000)")

In [30]: tarray.repeat(3, 1000)
Out[30]: [0.097441911697387695, 0.049738168716430664, 0.051641941070556641]

In [31]: tlist.repeat(3, 1000)
Out[31]: [0.1062159538269043, 0.062600851058959961, 0.063454866409301758]
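
A variant of this measurement for current Python and NumPy might look like the sketch below. It is not the session above: it assumes NumPy is installed, it compares two distinct but equal containers (a and b) so that neither side benefits from comparing an object with itself, and the helper best_time is a hypothetical name. No timings are quoted because they depend on the machine and versions in use.

```python
import timeit


def best_time(stmt, setup, repeat=3, number=1000):
    """Return the fastest of `repeat` runs, each executing stmt `number` times."""
    return min(timeit.repeat(stmt, setup=setup, repeat=repeat, number=number))


# Two separately built lists holding equal integers.
list_setup = "a = list(range(1000, 2000)); b = list(range(1000, 2000))"

# Two separate arrays holding equal integers.
array_setup = "import numpy as np; a = np.arange(1000, 2000); b = a.copy()"

t_list = best_time("a == b", list_setup)
t_array = best_time("(a == b).all()", array_setup)

print("list :", t_list)
print("array:", t_array)
```

Taking the minimum of the repeats is a common way to reduce noise from the first, cache-cold run, which is visibly slower in both outputs above.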

-- 
Robert Kern
rkern at ucsd.edu

"In the fields of hell where the grass grows high
  Are the graves of dreams allowed to die."
   -- Richard Harter



