Feature suggestion: sum() ought to use a compensated summation algorithm

Duncan Booth duncan.booth at invalid.invalid
Sun May 4 11:58:25 EDT 2008


Szabolcs Horvát <szhorvat at gmail.com> wrote:

> I thought that it would be very nice if the built-in sum() function used 
> this algorithm by default.  Has this been brought up before?  Would this 
> have any disadvantages (apart from a slight performance impact, but 
> Python is a high-level language anyway ...)?

There's a thread you might find interesting:

http://groups.google.com/group/comp.lang.python/browse_thread/thread/9eda29faf92f532e/027cef7d4429aa3a

In that thread I suggested that Python ought to implement sum by adding 
together each pair of values, then each pair of results and so on. This 
means that for input values of a similar magnitude the intermediate results 
stay balanced. The implementation doesn't actually do all the first level 
sums first, it builds up a stack containing the sum of 2^k, 2^(k-1)...2 
values. Also it works for other types as well, and for strings or lists has 
the advantage of minimising the number of large string or list concatenations.

If you follow that thread far enough you'll find a post by Jussi Salmela 
which shows that summing adjacent pairs performs as well as Kahan summation
(he says 'almost as good' but in fact the only time they get different 
results the 'correct' answer is 500000.000005 and Kahan and sumpairs get 
the two nearest representable results either side: 500000.00000499998 and 
500000.00000500004 respectively).

I never did get round to re-coding my sumpairs() function in C, I would be 
interested to see just how much overhead it introduces (and I suspect it
would be faster than the existing sum in some cases such as for lists).



More information about the Python-list mailing list