[Python-ideas] Fast sum() for non-numbers - why so much worries?

Joshua Landau joshua at landau.ws
Thu Jul 11 05:05:51 CEST 2013


On 11 July 2013 03:36, Steven D'Aprano <steve at pearwood.info> wrote:
> On 11/07/13 04:21, Joshua Landau wrote:
>>
>> On 10 July 2013 17:29, Steven D'Aprano <steve at pearwood.info> wrote:
>>>
>>> On 10/07/13 16:09, Joshua Landau wrote:
>>>>
>>>>
>>>> On 9 July 2013 17:13, Steven D'Aprano <steve at pearwood.info> wrote:
>>>
>>>
>>> [...]
>>>
>>>>> Nevertheless, you are right, in Python 3 both + and sum of lists is
>>>>> well-defined. At the moment sum is defined in terms of __add__. You want
>>>>> to change it to be defined in terms of __iadd__. That is a semantic
>>>>> change that needs to be considered carefully, it is not just an
>>>>> optimization.
>>>>
>>>>
>>>>
>>>> I agree it's not totally backward-compatible, but AFAICT that's only
>>>> for broken code. __iadd__ should always just be a faster, in-place
>>>> __add__ and so this change should never cause problems in properly
>>>> written code.
>>>
>>>
>>>
>>> "Always"? Immutable objects cannot define __iadd__ as an in-place
>>> __add__.
>>>
>>> In any case, sum() currently does not modify the start argument in-place.
>>
>>
>> Now you're just (badly) playing semantics. If I say that gills are
>> always like lungs except they work underwater, would you contradict me
>> by stating that mammals don't have gills?
>
>
> I don't actually understand your objection. You made a general statement
> that __iadd__ should ALWAYS be an in-place add,

Which it should.

> and I pointed out that this
> cannot be the case for immutable classes. What is your objection? Surely
> you're not suggesting that immutable classes can define in-place __iadd__?

Of course not.

> That's not a rhetorical question, I do not understand the point you are
> trying to make. Perhaps you should be explicit, rather than argue by
> analogy.

Rather, I shall explain by analogy :P.

I said that __iadd__ should always be a faster __add__. This is saying
that all <a> have property <b>, which is analogous to stating that all
gills are like lungs (except [that] they work underwater).

You objected by saying there are some things that cannot implement
__iadd__. This has *no correlation* to the previous statement - that
was about the properties that all __iadd__ have. That is why I made an
analogy of you contradicting me by saying that mammals don't have
gills.

So basically, I made no claim in any way about what objects have
__iadd__, but about what __iadd__ does (which is of course for only
those circumstances where it applies -- I know that's a tautology but
this whole sub-discussion seems to be one).

> [Aside: it's a poor analogy. Gills are not like lungs, they differ greatly
> in many ways,

This isn't really relevant, but alas they *are* like lungs. Sure, it's
an imperfect relation, but that's why I said "like lungs" and not "are
lungs".

> e.g. fluid flow is unidirectional in gills, bidirectional in
> lungs, the interface to the blood system is counter-current in gills, with
> an efficiency of about 80%, versus concurrent in lungs, with an efficiency
> around 25%. There are other significant differences too, and lungs evolved
> independently of gills, which is why lungfish have both.]

I honestly didn't know that. Interesting.


>>>> That makes it anything but a semantic change.
>>>
>>>
>>> __iadd__ is optional for classes that support addition. Failure to define
>>> an __iadd__ method does not make your class broken.
>>>
>>> Making __iadd__ mandatory to support sum would be a semantic change, since
>>> there will be objects (apart from strs and bytes, which are special-cased)
>>> that support addition with + but will no longer be summable since they
>>> don't define __iadd__.
>>
>>
>> Why are you saying these things? I never suggested anything like that.
>
>
> You want to change sum from using __add__ to __iadd__. That means that there
> are two possibilities: for a class to be summable, either __iadd__ is
> mandatory, or it is optional with a fallback to __add__. I considered both
> possibilities, and they both result in changes to the behaviour of sum, that
> is, a semantic change.
>
> If __iadd__ becomes mandatory, then some currently summable classes will
> become non-summable.

I don't believe that was ever suggested; there is a good reason "+="
falls back on "+" by default.
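
To make the fallback concrete, here is a minimal sketch. The Money class is
a hypothetical example of mine, not anything from the proposal; it defines
only __add__, and "+=" still works by rebinding the name to a new object.

```python
class Money:
    """Hypothetical class that defines __add__ but no __iadd__."""

    def __init__(self, amount):
        self.amount = amount

    def __add__(self, other):
        return Money(self.amount + other.amount)


a = Money(1)
original = a
a += Money(2)  # no __iadd__, so this is evaluated as a = a + Money(2)

result_amount = a.amount            # the sum of both amounts
rebound = a is not original         # the name now refers to a new object
original_amount = original.amount   # the first object was not modified
```

So a sum() that preferred __iadd__ could presumably use the same fallback
and keep such classes summable.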

> If __iadd__ becomes optional, but preferred over __add__, then some
> currently summable classes will change their behaviour (although you call
> those classes "broken").

That is what I was doing - calling them broken.

> In either case, this is a semantic change to sum, which is what you
> explicitly denied.

I'm not sure not supporting broken code counts as a semantic change.
That is what I was debating.

> I think that it is a reasonable position to take that we should not care
> about "broken" classes that define __iadd__ differently to __add__. I'm not
> sure that I agree, but regardless, it is a reasonable position. But arguing
> that the proposed change from __add__ to __iadd__ is not a semantic change
> to sum is simply unreasonable.

But that is what I am doing :P. If a spec is undefined, you don't
require results to be consistent. This is what would happen. That
changes nothing, as far as I am concerned -- and hence is not a
semantic change.
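
For concreteness, a rough sketch of the kind of class I am calling broken.
Both iadd_sum and Odd are hypothetical names I made up for illustration;
iadd_sum is only my guess at what a "+="-based sum might look like, not how
the builtin works.

```python
import operator


def iadd_sum(iterable, start=0):
    """Hypothetical sum() built on += (not the builtin's behaviour)."""
    total = start
    for item in iterable:
        total = operator.iadd(total, item)  # equivalent to: total += item
    return total


class Odd:
    """Hypothetical class whose += deliberately disagrees with +."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        return Odd(self.value + other.value)

    def __iadd__(self, other):
        self.value -= other.value  # inconsistent with __add__ on purpose
        return self


items = [Odd(1), Odd(2)]
via_add = sum(items, Odd(10)).value        # builtin sum uses __add__
via_iadd = iadd_sum(items, Odd(10)).value  # sketch uses __iadd__
```

The two results differ only because Odd disagrees with itself, which is the
case I am arguing we need not keep consistent.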

>>> Even making __iadd__ optional will potentially break working code. Python
>>> doesn't *require* that __iadd__ perform the same operation as __add__.
>>> That is the normal expectation, of course, but it's not enforced. (How
>>> could it be?) We might agree that objects where __add__ and __iadd__ do
>>> different things are "broken" in some sense, but you're allowed to write
>>> broken code, and Python should (in principle) avoid making it even more
>>> broken by changing behaviour unnecessarily. But maybe the right answer
>>> there is simply "don't call sum if you don't want __iadd__ called".
>>
>>
>> Python has previously had precedents where broken code does not get to
>> dictate the language as long as that code was very rare. This is more
>> than very rare. Additionally, Python does (unclearly, but it does do
>> so) define __iadd__ to be an inplace version of __add__, so the code
>> isn't just “broken” -- it's broken.
>
>
> Not so. The docs for __iadd__ and other augmented assignment operators
> state:
>
> "These methods are called to implement the augmented arithmetic assignments
> (+=, -=, *=, /=, //=, %=, **=, <<=, >>=, &=, ^=, |=). These methods should
> attempt to do the operation in-place (modifying self) and return the result
> (which could be, but does not have to be, self)."
>
> So, according to the docs, "x += y" might modify x in place and return a
> different instance, or even a completely different value. It is normal, and
> expected, for "x += y" to be the same as "x = x + y", but not compulsory.
> Python will fall back on the usual __add__ if __iadd__ is not defined, but
> (say) if you define your own DSL where += has some distinct meaning, you are
> free to define it to do something completely different.

I read it differently. I am not sure why one would do anything other
than return self, but I also read "to implement the augmented
arithmetic assignments" and "should attempt to do the operation
in-place (modifying self) and return the result". The final qualifier
only applies to circumstances, as far as I can glean, that are still
*attempted in-place additions*. Good examples could be when in-place
attempts fail and it would be faster to do __add__ right there and
then. Other good examples are taking a while to come, but this is
quite a niche area.
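
A sketch of the sort of case I have in mind, assuming a made-up
fixed-capacity container (Buffer and its capacity parameter are hypothetical,
purely for illustration). __iadd__ attempts the in-place operation and only
returns a different object when that attempt cannot succeed.

```python
class Buffer:
    """Hypothetical fixed-capacity container."""

    def __init__(self, items=(), capacity=4):
        self.items = list(items)
        self.capacity = capacity

    def __add__(self, other):
        items = self.items + other.items
        return Buffer(items, max(self.capacity, len(items)))

    def __iadd__(self, other):
        if len(self.items) + len(other.items) <= self.capacity:
            self.items.extend(other.items)  # in-place attempt succeeded
            return self
        return self + other  # no room: build a new object instead


buf = Buffer([1, 2])
first = buf
buf += Buffer([3])              # fits within capacity
same_object = buf is first      # still the original object

buf += Buffer([4, 5, 6])        # exceeds capacity
new_object = buf is not first   # a different object was returned
final_items = buf.items
```

Either way the result is what "+" would have produced, which is how I read
the "does not have to be self" clause.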

> You consider it broken if a class defines += differently to +. I consider it
> unusual, but permitted. I believe the docs support my interpretation.
>
> http://docs.python.org/release/3.1/reference/datamodel.html#object.__iadd__
>
> E.g. I have a DSL where = reassigns to a data structure, += appends to an
> existing one, and + is not defined at all. You can say "x += value" but not
> "x = x + value". It makes sense in context. As I said, I am prepared to
> consider that the right answer to this is "well don't call sum on your data
> structure then", but it is a change in behaviour, not just an optimization.

That is... really quite a good argument. I think I may have to think
on that final point, but you've probably just about won it. Why didn't
you just say this from the start?
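
To check that I follow, here is a minimal sketch of what I take your DSL
case to look like. Log is a hypothetical stand-in of mine, not your actual
code: "+=" appends, "+" is not defined at all.

```python
class Log:
    """Hypothetical DSL-style structure: += appends, + is undefined."""

    def __init__(self):
        self.entries = []

    def __iadd__(self, value):
        self.entries.append(value)
        return self


log = Log()
log += "a"  # fine: dispatches to __iadd__

try:
    log = log + "b"  # Log has no __add__, so this raises TypeError
    plus_failed = False
except TypeError:
    plus_failed = True

try:
    sum(["b", "c"], Log())  # the builtin uses +, so this fails as well
    sum_failed = False
except TypeError:
    sum_failed = True
```

Today sum() raises here; a sum() that preferred __iadd__ would instead
succeed and append, so the behaviour would change for code that is not
obviously broken.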

