[pypy-dev] Calling lambdas inside loop causes significant slowdown

Mon Aug 4 18:06:29 CEST 2014

Hi,

I'm trying to figure out the fastest way in PyPy to introduce
abstractions into loops, e.g. refactoring the following code:

def sum_direct(data):
    s = 0
    for i in data:
        if i < 5:
            s += i + 1
    return s

to something like:

def sum_lambda(data):
    filter_func = lambda x: x < 5
    map_func = lambda x: x + 1

    s = 0
    for i in data:
        if filter_func(i):
            s += map_func(i)
    return s

and then turning both lambdas into arguments, class members and so on.
However, the refactoring shown above already adds about 50% runtime
overhead, and further refactorings do not improve on that.
Shouldn't the tracing/inlining eliminate most of this overhead or is
there a mistake on my part?
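
To make the next refactoring step concrete, here is a minimal sketch of the
"lambdas as arguments" variant. The name sum_with_funcs is made up for
illustration and does not appear in the code above; the loop body is the same
as in sum_lambda.

```python
def sum_with_funcs(data, filter_func, map_func):
    # Same loop as sum_lambda, but the two callables are passed in
    # by the caller instead of being created inside the function.
    s = 0
    for i in data:
        if filter_func(i):
            s += map_func(i)
    return s


# Small usage example: only 1 and 3 pass the filter, giving (1+1) + (3+1).
result = sum_with_funcs([1, 7, 3, 9], lambda x: x < 5, lambda x: x + 1)
```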

I timed both methods on a large array:

from array import array
import time

data = array('i')
for i in xrange(100000000):
    data.append(i % 10)

t = time.time()
result = sum_lambda(data)  # or sum_direct
print result, time.time() - t

On average, calling sum_direct() takes about 0.43 seconds and
sum_lambda() about 0.64 seconds.
(I'm at changeset 72674:78d5d873a260 from Aug 3 2014, translated and run
on Ubuntu 14.04)
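
A self-contained sketch of the comparison, for anyone who wants to rerun it.
The helper best_of is an assumption added here (it is not part of the script
above), and the array is much smaller than the 100000000 elements used for the
numbers quoted, so the absolute times will differ. Taking the best of several
runs is only meant to reduce noise from JIT warm-up and the system.

```python
from array import array
import time


def sum_direct(data):
    s = 0
    for i in data:
        if i < 5:
            s += i + 1
    return s


def sum_lambda(data):
    filter_func = lambda x: x < 5
    map_func = lambda x: x + 1
    s = 0
    for i in data:
        if filter_func(i):
            s += map_func(i)
    return s


def best_of(func, data, repeats=3):
    # Hypothetical helper: call func(data) several times and keep
    # the fastest wall-clock time together with the last result.
    best = None
    result = None
    for _ in range(repeats):
        t = time.time()
        result = func(data)
        elapsed = time.time() - t
        if best is None or elapsed < best:
            best = elapsed
    return result, best


# Reduced size (assumption) so the sketch finishes quickly everywhere.
data = array('i', [i % 10 for i in range(1000000)])

direct_result, direct_time = best_of(sum_direct, data)
lambda_result, lambda_time = best_of(sum_lambda, data)

print("sum_direct: %d in %.4fs" % (direct_result, direct_time))
print("sum_lambda: %d in %.4fs" % (lambda_result, lambda_time))
```

Both functions should return the same sum; only the timings are of interest.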

The JIT trace of the lambda code basically adds two force_token()
operations and potentially more expensive guards. Is there any chance to
avoid these without excessive metaprogramming? If not, which speedup
tricks (JIT hooks, code generation, etc.) can you recommend
for implementing such APIs?
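
To illustrate what "code generation" could mean here, the following is one
possible sketch: the filter and map steps are given as expression strings and
pasted into the loop source before it is compiled, so the loop contains no
calls. The names make_sum and generated_sum are hypothetical, and no claim is
made about how the JIT treats the result.

```python
def make_sum(filter_expr, map_expr):
    # Build the source of a loop with both expressions written inline.
    # Both strings must be expressions over the loop variable "x".
    source = (
        "def generated_sum(data):\n"
        "    s = 0\n"
        "    for x in data:\n"
        "        if %s:\n"
        "            s += %s\n"
        "    return s\n"
    ) % (filter_expr, map_expr)
    namespace = {}
    # Compile and run the generated definition in a fresh namespace.
    exec(source, namespace)
    return namespace["generated_sum"]


sum_generated = make_sum("x < 5", "x + 1")
total = sum_generated([1, 7, 3, 9])  # same computation as sum_direct
```

Since the expressions are executed as code, this only makes sense for strings
that come from the program itself, never from untrusted input.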

Thanks in advance,
Toni