Speed-up for loops

Steven D'Aprano steve at REMOVE-THIS-cybersource.com.au
Wed Sep 8 09:13:09 EDT 2010


On Tue, 07 Sep 2010 11:00:03 +0100, BartC wrote:

>> for i in xrange(100000000):
>>    a = a + f(i)
>>
>> then unrolling the loop is even less useful. The overhead of the loop
>> itself is likely to be trivial compared to the cost of calling f() 100
>> million times -- the added complexity to shave 3 seconds off a four
>> minute calculation simply isn't worth it.
> 
> With Python 3 and def f(x): return x+1, unrolling this loop 4x improved
> speed by 15%; 4.00 minutes reduces to 3.30 minutes.


I'm afraid that I can't replicate those figures. In my test, unrolling 
the loop causes a massive SLOWDOWN of 37%, not a speed up. Here is my 
test code:


def f(x):
    return x+1

def test_loop(n):
    a = 0
    for i in range(n):
        a += f(i)
    return a

def test_unrolled(n):
    a = 0
    for i in range(n//4):
        a += f(4*i)
        a += f(4*i+1)
        a += f(4*i+2)
        a += f(4*i+3)
    return a


from timeit import Timer
n = 10000000
assert test_loop(n) == test_unrolled(n)
t1 = Timer('test_loop(n)', 
  'from __main__ import test_loop; n=%d' % n)
t2 = Timer('test_unrolled(n)', 
  'from __main__ import test_unrolled; n=%d' % n)

And here are the results using Python 3.1. Times are the best of 5 for 10 
calls each to the loop_* functions, results given in seconds:

>>> min(t1.repeat(number=10, repeat=5))/10
5.97409288883209
>>> min(t2.repeat(number=10, repeat=5))/10
8.25656900405883


I repeated it with a larger loop variable. Since the time per test was so 
large (over ten minutes per call on my machine!), I didn't see the need 
to run multiple trials:

n *= 100
assert test_loop(n) == test_unrolled(n)
t3 = Timer('test_loop(n)', 
  'from __main__ import test_loop; n=%d' % n)
t4 = Timer('test_unrolled(n)', 
  'from __main__ import test_unrolled; n=%d' % n)

And the results:

>>> t3.timeit(number=1)
653.3572518825531
>>> t4.timeit(number=1)
864.6454808712006

That's slightly better (32% slower instead of 37% slower), but still a 
massive performance hit. Given these results, I'm prepared to say that 
loop unrolling in Python is almost certainly going to be a pessimation, 
not an optimization, no matter what you have inside the loop.



-- 
Steven



More information about the Python-list mailing list