sum for sequences?

Steve Howell showell30 at yahoo.com
Sun Mar 28 10:16:10 EDT 2010


On Mar 27, 3:18 am, Steven D'Aprano <st... at REMOVE-THIS-
cybersource.com.au> wrote:
> On Fri, 26 Mar 2010 07:31:17 -0700, Steve Howell wrote:
> > From a purely academic standpoint, I'm not convinced that sum() is
> > inefficient in terms of big-O complexity, though.
>
> >  showell at showell-laptop:~$ python
> >  Python 2.6.2 (release26-maint, Apr 19 2009, 01:56:41) [GCC 4.3.3] on
> >  linux2
> >  >>> class StupidList:
>
> [...]
>
> But it's not *sum* that is inefficient, it is sum *with a particular data
> structure*.
>

Yep, the implied context was with particular data structures.

> Sure, if you create your own data structure, you can make it as efficient
> as you like. Obviously the sum algorithm itself has to perform one
> addition per item, or O(N), which scales tolerably well. But each
> addition has a cost. If the cost is constant, then sum() as a whole
> remains O(N). But if the cost of addition varies with N, sum() degrades
> ba
>

The surprising part of sum() is not that the outer loop to do the sums
is O(N).  It is hard to imagine any other implementation (without
parallelizing it).

The mildly surprising part of sum() is that is does add vs. add-in-
place, which leads to O(N) vs. O(1) for the inner loop calls, for
certain data structures, notably lists, even though none of the
intermediate results get used by the caller.  For lists, you could
make a more efficient variant of sum() that clones the start value and
does add-in-place.

I could guess pretty confidently that the reason this optimization was
never tried is that sum() has always been intended to be used on
numerics, since other alternatives exist for strings (join), lists
(chain), and hand-coded data classes that support add-in-place (roll-
your-own loop).

The documentation is pretty clear on the intention that sum() is
intended for numbers:

http://docs.python.org/library/functions.html#sum

Except for strings, the docs are not explicit about efficiency
concerns for other data structures, or the fact that the reference
implementation does add vs. add-in-place under the hood.




http://docs.python.org/library/functions.html#sum



More information about the Python-list mailing list