genexp performance problem?

Peter Otten __peter__ at web.de
Wed May 31 03:26:31 EDT 2006


Giovanni Bajo wrote:

> I found this strange:
> 
> python -mtimeit "sum(int(L) for L in xrange(3000))"
> 100 loops, best of 3: 5.04 msec per loop
> 
> python -mtimeit "import itertools; sum(itertools.imap(int, xrange(3000)))"
> 100 loops, best of 3: 3.6 msec per loop
> 
> I thought the two constructs could achieve the same speed.

I think early binding would have been preferable, but as Fredrik Lundh said,
int is looked up on every iteration, which accounts for the slowdown.
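
One way to see that lookup is to disassemble the generator expression. This
is only a sketch, assuming Python 3 (range instead of xrange);
find_genexp_code is a hypothetical helper written for this illustration, not
something from the thread:

```python
import dis


def find_genexp_code(code):
    """Return the nested <genexpr> code object inside `code`, or None."""
    for const in code.co_consts:
        if hasattr(const, "co_code") and const.co_name == "<genexpr>":
            return const
    return None


# Compile the expression without running it; Python 3 spelling assumed.
outer = compile("sum(int(i) for i in range(3000))", "<example>", "eval")
genexp = find_genexp_code(outer)

# Names fetched with LOAD_GLOBAL are resolved at run time (globals first,
# then builtins) every time the instruction executes.
global_loads = [
    ins.argval
    for ins in dis.get_instructions(genexp)
    if ins.opname == "LOAD_GLOBAL"
]
print(global_loads)
```

Running dis.dis(genexp) shows where that instruction sits relative to the
loop, which is the part that matters for the timing difference.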

For reference:
$ python -m timeit "sum(int(i) for i in xrange(3000))"
1000 loops, best of 3: 1.92 msec per loop
$ python -m timeit -s "from itertools import imap" "sum(imap(int, xrange(3000)))"
1000 loops, best of 3: 1.17 msec per loop

You can shave off about 9 percent by turning int into a local variable:
$ python -m timeit -s "int_ = int" "sum(int_(i) for i in xrange(3000))"
1000 loops, best of 3: 1.74 msec per loop
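
The same comparison can be written as a script. This is a rough sketch,
assuming Python 3 (range instead of xrange); the function names
builtin_lookup and local_binding are made up for the example, and the
timings will vary with interpreter version and machine:

```python
import timeit


def builtin_lookup(n=3000):
    # int is resolved via globals, then builtins, on every iteration.
    return sum(int(i) for i in range(n))


def local_binding(n=3000):
    # int_ is local to this function; the generator expression reaches it
    # through a closure cell instead of the globals/builtins lookup.
    int_ = int
    return sum(int_(i) for i in range(n))


if __name__ == "__main__":
    for func in (builtin_lookup, local_binding):
        # Best of 3 runs of 200 calls each, reported per call.
        best = min(timeit.repeat(func, number=200, repeat=3))
        print("%s: %.3f msec per loop" % (func.__name__, best / 200 * 1e3))
```

Both functions return the same sum; only the way the name is resolved inside
the loop differs.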

On the other hand, the function call overhead that imap() imposes whenever
you have to pass it a Python function is larger than the cost of a symbol
lookup:
$ python -m timeit -s "def square(i): return i*i" -s "from itertools import imap" "sum(imap(square, xrange(3000)))"
100 loops, best of 3: 2.25 msec per loop
$ python -m timeit "sum(i*i for i in xrange(3000))"
1000 loops, best of 3: 1.29 msec per loop
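
As a script, the two variants might look like this. Again a sketch under the
assumption of Python 3, where the builtin map() is lazy and takes the place
of itertools.imap; mapped and inlined are hypothetical names chosen for the
example:

```python
import timeit


def square(i):
    return i * i


def mapped(n=3000):
    # map() has to call the Python-level function square once per item.
    return sum(map(square, range(n)))


def inlined(n=3000):
    # i * i is evaluated in the generator's own frame, so there is no
    # extra function call per item.
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    for func in (mapped, inlined):
        # Best of 3 runs of 200 calls each, reported per call.
        best = min(timeit.repeat(func, number=200, repeat=3))
        print("%s: %.3f msec per loop" % (func.__name__, best / 200 * 1e3))
```

Which one wins on a given interpreter is something to measure rather than
assume; the point is only that the mapped version pays for a call per item.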

Peter


