[Python-ideas] Intermediate Summary: Fast sum() for non-numbers

Joshua Landau joshua at landau.ws
Mon Jul 15 09:16:05 CEST 2013


On 14 July 2013 21:42, David Mertz <mertz at gnosis.cx> wrote:
> On Sun, Jul 14, 2013 at 12:26 PM, Sergey <sergemp at mail.ru> wrote:
>>
>> * Sum is not obvious (for everyone) way to add lists, so people
>>   should not use it, as there're alternatives, i.e. instead of
>>   - sum(list_of_lists, [])
>>   one can use:
>>   - reduce(operator.iadd, list_of_lists, [])
>>   - list(itertools.chain.from_iterable(list_of_lists))
>>   - result = []
>>     for x in list_of_lists:
>>         result.extend(x)
>
>
> It seems to me that in order to make sum() look more attractive, Sergey
> presents ugly versions of alternative ways to (efficiently) concatenate
> sequences.
>
> One can make these look much nicer, e.g. (assuming there is a 'from
> itertools import chain' at the very top of the file, which is the sensible
> place to put it).
>
>   # If 'list_of_lists' really is as it is named, there is no need to
>   # treat it as generic iterable.  Moreover, one doesn't usually need
>   # to make an actual instantiated list from chain() for most purposes.
>   # So:
>   flat = chain(list_of_lists)

This does nothing more than iter(list_of_lists)...
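
A minimal sketch of the difference, assuming a small made-up
list_of_lists (the sample values are not from the original thread):

```python
from itertools import chain

# Sample data, assumed purely for illustration.
list_of_lists = [[1, 2], [3], [4, 5, 6]]

# chain() with a single argument just iterates over that one argument,
# so the inner lists come back whole rather than flattened.
not_flat = list(chain(list_of_lists))

# chain.from_iterable() iterates over each inner list in turn.
flat = list(chain.from_iterable(list_of_lists))
```

Here not_flat is the same as list(iter(list_of_lists)), while flat holds
the individual elements.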

>   # If we do start with an iterable of lists, but know it isn't
>   # infinite, just use:
>   flat = chain(*iter_of_lists)

Just use chain.from_iterable(...) here too; it might be longer, but
it's more flexible and has almost no downsides. chain(*iter_of_lists)
has to unpack the whole iterable up front, which is redundant work for
the sake of saving a few characters.
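
To make the "redundant work" concrete, here is a rough sketch. The
generator iter_of_lists() and the consumed log are hypothetical names
made up for this example:

```python
from itertools import chain

consumed = []  # records which inner lists the outer generator has produced

def iter_of_lists():
    # Hypothetical outer iterable, made up for this example.
    for i in range(3):
        consumed.append(i)
        yield [i, i * 10]

# chain.from_iterable() pulls nothing from the generator until asked.
lazy = chain.from_iterable(iter_of_lists())
consumed_before_lazy = list(consumed)
lazy_result = list(lazy)

consumed.clear()

# Argument unpacking exhausts the generator before chain() is even called.
eager = chain(*iter_of_lists())
consumed_before_eager = list(consumed)
eager_result = list(eager)
```

Both spellings yield the same elements; the only difference is when the
outer iterable gets consumed.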

> If it is really needed, of course chain.from_iterable() can be used.
> Although the only time you'd want that is when the iterable is potentially
> infinite, and in that case you *definitely* don't want to make it back into
> a list either, just:
>
>   inf_flat = chain.from_iterable(endless_lists)
>
> Another approach in one of the links Sergey gave is nice too, and shorter
> and more elegant than any of his alternatives:
>
>   flat = []
>   map(flat.extend, list_of_lists)

Gah! No like.

flat = []
for lst in list_of_lists:
    flat.extend(lst)

is no longer, and it also doesn't force you to
"deque(maxlen=0).extend(...)" the map object on Python 3 just to make
it run.
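
A small sketch of why, assuming Python 3 (where map() returns a lazy
iterator) and made-up sample data:

```python
from collections import deque

# Sample data, assumed purely for illustration.
list_of_lists = [[1, 2], [3], [4, 5, 6]]

# On Python 3, map() is lazy: creating the map object calls nothing.
flat_map = []
pending = map(flat_map.extend, list_of_lists)
len_before = len(flat_map)

# Something has to drain the iterator; a zero-length deque discards
# the None results while still calling flat_map.extend for each list.
deque(maxlen=0).extend(pending)

# The plain loop needs no such trick.
flat_loop = []
for lst in list_of_lists:
    flat_loop.extend(lst)
```

flat_map stays empty until the deque drains the iterator; afterwards
both lists hold the same elements.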

> Using map() for a side effect is slightly wrong, but this is short,
> readable, and obvious in purpose.

I disagree somewhat.

> On the other hand, as I've said before, when I read:
>
>   flat = sum(list_of_lists, [])
>
> It just looks WRONG!

I definitely don't think this is nearly as bad as map(flat.extend,
list_of_lists); not only is this *defined* to work

> Yes, I know why it works, because of some quirks of
> Python internals.

You think that "[1, 2, 3] + [4, 5, 6] == [1, 2, 3, 4, 5, 6]" is a
quirk of Python's internals?
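
Spelled out as a sketch with made-up sample data: sum() with a list as
the start value gives the same result as repeatedly applying the
ordinary, documented "+" on lists.

```python
# Sample data, assumed purely for illustration.
list_of_lists = [[1, 2, 3], [4, 5, 6], [7]]

# What sum() amounts to here: start from [] and apply "+" repeatedly.
total = []
for lst in list_of_lists:
    total = total + lst

flat = sum(list_of_lists, [])
```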

> But it absolutely doesn't *read* like it should mean what
> it does

You can look up the term "sum" -- it absolutely does.

> or that it should necessarily even work at all.  The word SUM is
> self-evidently and intuitively about *adding numbers*

No it's not.

> and *not* about "doing
> something that is technically supported because other things have an
> .__add__() method".

Again, this is wrong.

https://en.wikipedia.org/wiki/Summation
> Besides numbers, other types of values can be added as well: vectors, matrices, polynomials and, in general, elements of any additive group (or even monoid).

Google's (aggregated) dictionary:
> The total amount resulting from the addition of two or more numbers, amounts, or items
> the final aggregate; "the sum of all our troubles did not equal the misery they suffered" (a good example of where you *already know* you can sum things other than numbers)

So first tell me why it makes sense to sum "misery" but not "lists".
How is "misery" more like a number than a "list"?

> As various people have observed, if Python used some other operator for
> concatenation, we wouldn't be having this discussion at all.  E.g. if we
> had:
>
>   concat = [1, 2, 3] . [4, 5, 6]
>
> Then we might have a method called .__concat__() on various collections.
> Conceptually that really is what Python is doing now.  It's just that Guido
> made the very reasonable decision that the symbol "+" was something users
> could intuitively read as meaning concatenation when appropriate, but as
> addition in other cases.
>
> I definitely don't prefer some other operator than '+' to concatenate
> sequences.  However, I think possibly if I had a time machine I might go
> back and change the spelling of .__add__() to .__plus__().  That might more
> clearly indicate that we don't really mean "mathematical addition" but
> rather simply "what the plus sign does".

I agree with none of this (except the start: if Python used some other
operator we'd only be having *different* discussions).

