[Python-ideas] Fast sum() for non-numbers

Andrew Barnert abarnert at yahoo.com
Wed Jul 3 06:32:11 CEST 2013


On Jul 2, 2013, at 11:12, Sergey <sergemp at mail.ru> wrote:

> Questions that may arise if the patch is accepted:
> * sum() was rejecting strings because of this bug. If the bug gets fixed
>  should another patch allow sum() to accept strings?

Does it actually speed up strings? I would have thought it only helps for types that have a mutating __iadd__ (and one that's faster than the non-mutating __add__, of course).
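
To make the distinction concrete, here is a small sketch of how += behaves on a list (which defines a mutating __iadd__) versus a str (which is immutable and defines no __iadd__ at all). The variable names are only illustrative.

```python
# list defines __iadd__, so += extends the existing object in place.
acc = []
alias = acc
acc += [1, 2]
list_same_object = acc is alias      # still the same list object

# str defines no __iadd__, so += falls back to __add__ and rebinds the name
# to a new string object.
s = "ab"
alias_s = s
s += "cd"
str_same_object = s is alias_s       # a different object now

list_has_iadd = hasattr(list, "__iadd__")
str_has_iadd = hasattr(str, "__iadd__")
```

So an __iadd__-based loop can avoid copying the accumulator for lists, while for strings each step still has to build a new object.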

> * maybe in some distant future drop the second parameter (or make it
>  None by default) and allow calling sum for everything, making sum()
>  "the one obvious way" to sum up things?

sum can't guess the unified type of all of the elements in the iterable. The best it could do is what reduce does in that case: start with the first element, and add from there. That's not always the magical DWIM you seem to be expecting.
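
As a rough illustration of "start with the first element, and add from there", this is what functools.reduce does when no initial value is given. The sample data is made up for the example.

```python
from functools import reduce
import operator

# With no initial value, reduce uses the first element as the accumulator.
rows = [[1, 2], [3], [4, 5]]
flat = reduce(operator.add, rows)

# The first element's type decides what "+" means, so a mixed iterable
# fails as soon as the types stop being compatible.
mixed = [(1,), [2]]
try:
    reduce(operator.add, mixed)
    mixed_ok = True
except TypeError:
    # tuple + list is not defined
    mixed_ok = False
```

Nothing here guesses a common type; the result is simply whatever repeated addition from the first element produces, or an error.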

Most importantly, how could it possibly work for iterables that might be empty? ''.join(lines), or sum(lines, ''), will work when there are no lines; sum(lines) can't possibly know that you expected a string.
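
The empty case can be checked directly. This sketch uses lists for the explicit-start example, since the sum() builtin currently rejects a str start value.

```python
from functools import reduce
import operator

lines = []

joined = "".join(lines)          # join knows the result type: an empty str
summed_lists = sum([], [])       # an explicit start value supplies the type
default_sum = sum(lines)         # with no start, sum falls back to the int 0

# reduce without an initial value has nothing to return for an empty input.
try:
    reduce(operator.add, lines)
    reduce_ok = True
except TypeError:
    reduce_ok = False
```

With no start value there is no information about the expected type, so the only choices are a fixed default (0 today), None, or an exception.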

Meanwhile, if you're going to add optional "start from the first item" functionality, I think you'll also want to make the operator/function overridable. 

And then you've just re-invented reduce, with a slightly different signature:

    def sum(iterable, start=None, function=operator.iadd):
        return reduce(function, iterable, start)
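
A runnable version of that idea might look like the following. This is only a sketch: the name generic_sum is hypothetical, and the explicit None check is my assumption, needed because reduce treats an explicitly passed None as a real initial value rather than as "no start".

```python
from functools import reduce
import operator

def generic_sum(iterable, start=None, function=operator.iadd):
    """Hypothetical reduce-backed sum; not the builtin."""
    if start is None:
        # No start value: let reduce begin from the first element.
        # An empty iterable raises TypeError in this branch.
        return reduce(function, iterable)
    return reduce(function, iterable, start)

total = generic_sum([1, 2, 3])
flat = generic_sum([[1], [2], [3]], [])

# Caveat: with operator.iadd and no start value, the first element itself
# becomes the accumulator and is mutated in place.
parts = [[1], [2]]
merged = generic_sum(parts)
first_was_mutated = merged is parts[0]
```

After the last call, parts is [[1, 2], [2]], which shows why defaulting to an in-place operator without a fresh start value is risky for mutable elements.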

> 
> It would be nice if sum "just worked" for everything (e.g. sum() of
> empty sequence would return None, i.e. if there's nothing to sum then
> nothing is returned). But I think it needs more work for that, because
> even with this patch sum() is still ~20 times slower than "".join() [4]

You could always special case it when start is a str (or when start is defaulted and the first value is a str). But then what happens if the second value is something that can be added to a str, but not actually a str? Or, for that matter, the start value?
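
To illustrate the problem with such a special case, here is a sketch using a hypothetical Tag class that is not a str but can be added to one through __radd__.

```python
class Tag:
    """Hypothetical type: not a str, but str + Tag is defined."""

    def __init__(self, name):
        self.name = name

    def __radd__(self, other):
        # Called for `other + self` after str.__add__ returns NotImplemented.
        return other + "<" + self.name + ">"

items = ["a", Tag("b")]

# Plain repeated addition works, via Tag.__radd__.
added = "" + items[0] + items[1]

# str.join accepts only str instances, so a join-based fast path would fail
# on the same input.
try:
    "".join(items)
    join_ok = True
except TypeError:
    join_ok = False
```

A str fast path would therefore need to check every element and fall back to ordinary addition, or it would change behavior for inputs like this one.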

