Profiling, sum-comprehension vs reduce

Sat Sep 13 16:27:09 EDT 2008

Terry Reedy <tjreedy at udel.edu> writes:

>> Of course, sum() is even faster than reduce:
>>
>>>>> Timer('sum(xrange(10000))').repeat(number=10000)
>> [9.814924955368042, 8.7169640064239502, 9.5062401294708252]
>
> 'Of course', because the irreducible difference between
> reduce(add.seq) and sum(seq) is that reduce has to call an add
> function while sum has the operation built-in in place of a call.

Note that, despite appearances, it's not as built-in as one might
wish.  sum(seq) is still completely generic and works on all
number-like objects (in fact on all objects that define an __add__
operation except strings, which are explicitly forbidden).  This means
that it can't just go ahead and execute a C addition, it must properly
call PyNumber_Add (the C API call equivalent to Python's "+"
operator), which will then inspect the objects and invoke the
appropriate implementation of addition.  On the other hand,
operator.add is defined to do exactly that -- call PyNumber_Add on its
arguments and return the result.

It's not entirely obvious why summing numbers using the same method,
repeatedly calling PyNumber_Add, would be significantly slower for
reduce than for sum.  Admittedly reduce has to execute a Python-level
function call which requires allocating and filling out a tuple, only
to reach PyNumber_Add, which sum calls immediately.  But
builtin_reduce() is actually careful to optimize those repeated
function calls, for example by reusing the same tuple across the loop.