[Numpy-discussion] Simple problem. Is it possible without a loop?

Bruce Southey bsouthey at gmail.com
Wed Jun 9 16:49:38 EDT 2010


On 06/09/2010 10:24 AM, Vicente Sole wrote:
>>> Well, a loop or list comparison seems like a good choice to me. It is
>>> much more obvious at the expense of two LOCs. Did you profile the two
>>> possibilities and are they actually performance-critical?
>>>
>>> cheers
>>>
>>>        
>
> The second is between 8 and 10 times faster on my machine.
>
> import numpy
> import time
> x0 = numpy.arange(10000.)
> niter = 2000   # I expect between 10000 and 100000
>
>
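> def option1(x, delta=0.2):
>     # keep a value only if it exceeds the last kept value by more than delta
>     y = [x[0]]
>     for value in x:
>         if (value - y[-1]) > delta:
>             y.append(value)
>     return numpy.array(y)
>
> def option2(x, delta=0.2):
>     # numpy.int was only an alias for the builtin int, so int is used here
>     y = numpy.cumsum((x[1:] - x[:-1]) / delta).astype(int)
>     i1 = numpy.nonzero(y[1:] > y[:-1])
>     return numpy.take(x, i1)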
>
>
> t0 = time.time()
> for i in range(niter):
>     t = option1(x0)
> print("Elapsed = ", time.time() - t0)
> t0 = time.time()
> for i in range(niter):
>     t = option2(x0)
> print("Elapsed = ", time.time() - t0)
>
>    
For integer arguments for delta, I don't see any difference between using 
option1 and using the '%' operator.
 >>> (x0[(x0*10)%2==0]-option1(x0)).sum()
0.0

Also, option2 gives a different result from option1, so these are not 
equivalent functions. You can see that from the shapes:
 >>> option2(x0).shape
(1, 9998)
 >>> option1(x0).shape
(10000,)
 >>> ((option1(x0)[:9998])-option2(x0)).sum()
0.0

So, allowing for the shape difference, option2 matches the first 9998 
elements of option1's output, but it returns two fewer elements.
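
The extra leading dimension comes from numpy.nonzero, which returns a tuple 
of index arrays; passing that tuple straight to numpy.take adds an axis. A 
minimal sketch on a 10-element array (the names steps, idx_tuple, picked_2d 
and picked_1d are only for illustration, not from the original post):

```python
import numpy as np

x = np.arange(10.)
delta = 0.2

# Same steps as option2, on a small input.
steps = np.cumsum((x[1:] - x[:-1]) / delta).astype(int)
idx_tuple = np.nonzero(steps[1:] > steps[:-1])  # a tuple holding one index array

picked_2d = np.take(x, idx_tuple)      # the tuple becomes a (1, 8) index array
picked_1d = np.take(x, idx_tuple[0])   # unpacking the tuple keeps the result 1-D

print(picked_2d.shape)  # (1, 8)
print(picked_1d.shape)  # (8,)
```

Unpacking the tuple only fixes the shape; the result still has two fewer 
elements than the input, because each of the two slicing steps drops one.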

Probably the main reason for the speed difference is that option2 is 
virtually pure numpy (and hence done in C), whereas option1 does a lot of 
Python-level array lookups, which are always slow. So keep it in numpy as 
much as possible.
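
To illustrate the cost of per-element lookups, here is a rough sketch that 
computes the same thing two ways: once with a Python loop indexing the array, 
and once with a single whole-array call. Note that this is a simpler task than 
option1 (it only compares neighbouring elements), and the helper names 
gaps_loop and gaps_vectorized are made up for this example. Timings will vary 
by machine.

```python
import timeit
import numpy as np

x = np.arange(10000.)
delta = 0.2

def gaps_loop(x, delta):
    # One Python-level array lookup (two, in fact) per element.
    return [x[i + 1] - x[i] > delta for i in range(len(x) - 1)]

def gaps_vectorized(x, delta):
    # The subtraction and comparison both run inside numpy.
    return np.diff(x) > delta

same = np.array_equal(np.array(gaps_loop(x, delta)), gaps_vectorized(x, delta))
t_loop = timeit.timeit(lambda: gaps_loop(x, delta), number=5)
t_vec = timeit.timeit(lambda: gaps_vectorized(x, delta), number=5)

print("same result:", same)
print("loop: %.4f s, vectorized: %.4f s" % (t_loop, t_vec))
```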


Bruce