[Python-ideas] Fast sum() for non-numbers - why so much worries?

Steven D'Aprano steve at pearwood.info
Thu Jul 11 04:36:59 CEST 2013


On 11/07/13 04:21, Joshua Landau wrote:
> On 10 July 2013 17:29, Steven D'Aprano <steve at pearwood.info> wrote:
>> On 10/07/13 16:09, Joshua Landau wrote:
>>>
>>> On 9 July 2013 17:13, Steven D'Aprano <steve at pearwood.info> wrote:
>>
>> [...]
>>
>>>> Nevertheless, you are right, in Python 3 both + and sum of lists are
>>>> well-defined. At the moment sum is defined in terms of __add__. You want
>>>> to change it to be defined in terms of __iadd__. That is a semantic
>>>> change that needs to be considered carefully; it is not just an
>>>> optimization.
>>>
>>>
>>> I agree it's not totally backward-compatible, but AFAICT that's only
>>> for broken code. __iadd__ should always just be a faster, in-place
>>> __add__ and so this change should never cause problems in properly
>>> written code.
>>
>>
>> "Always"? Immutable objects cannot define __iadd__ as an in-place __add__.
>>
>> In any case, sum() currently does not modify the start argument in-place.
>
> Now you're just (badly) playing semantics. If I say that gills are
> always like lungs except they work underwater, would you contradict me
> by stating that mammals don't have gills?

I don't actually understand your objection. You made a general statement that __iadd__ should ALWAYS be an in-place add, and I pointed out that this cannot be the case for immutable classes. What is your objection? Surely you're not suggesting that immutable classes can define an in-place __iadd__? That's not a rhetorical question; I do not understand the point you are trying to make. Perhaps you should be explicit, rather than argue by analogy.
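
A minimal sketch of the point about immutable classes, using only built-in types (the variable names are illustrative). A tuple defines no __iadd__, so += falls back to __add__ and rebinds the name, while a list's += mutates the existing object:

```python
# A tuple is immutable and defines no __iadd__, so += falls back to __add__
# and rebinds the name to a newly created tuple.
t = (1, 2)
t_alias = t
t += (3,)
tuple_rebound = t is not t_alias   # the original tuple object is untouched

# A list is mutable and defines __iadd__, so += extends the object in place.
lst = [1, 2]
lst_alias = lst
lst += [3]
list_mutated = lst is lst_alias    # still the same object, now longer

print(hasattr(tuple, "__iadd__"), tuple_rebound, t_alias)
print(hasattr(list, "__iadd__"), list_mutated, lst_alias)
```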

[Aside: it's a poor analogy. Gills are not like lungs; they differ greatly in many ways. For example, fluid flow is unidirectional in gills but bidirectional in lungs, and the interface to the blood system is counter-current in gills, with an efficiency of about 80%, versus concurrent in lungs, with an efficiency around 25%. There are other significant differences too, and lungs evolved independently of gills, which is why lungfish have both.]


>>> That makes it anything but a semantic change.
>>
>> __iadd__ is optional for classes that support addition. Failure to define an
>> __iadd__ method does not make your class broken.
>>
>> Making __iadd__ mandatory to support sum would be a semantic change, since
>> there will be objects (apart from strs and bytes, which are special-cased)
>> that support addition with + but will no longer be summable since they don't
>> define __iadd__.
>
> Why are you saying these things? I never suggested anything like that.

You want to change sum from using __add__ to __iadd__. That means that there are two possibilities: for a class to be summable, either __iadd__ is mandatory, or it is optional with a fallback to __add__. I considered both possibilities, and they both result in changes to the behaviour of sum, that is, a semantic change.

If __iadd__ becomes mandatory, then some currently summable classes will become non-summable.

If __iadd__ becomes optional, but preferred over __add__, then some currently summable classes will change their behaviour (although you call those classes "broken").

In either case, this is a semantic change to sum, which is what you explicitly denied.
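
The "optional, with fallback" case can be sketched as follows. This is only an illustration under stated assumptions: iadd_sum is a hypothetical helper, not the builtin and not any concrete patch. It relies on operator.iadd, which performs += and therefore uses __iadd__ when the class defines it and __add__ otherwise. With lists, the visible difference is that the start argument gets modified:

```python
import operator


def iadd_sum(iterable, start):
    """Hypothetical sum() variant that prefers __iadd__ (NOT the real builtin)."""
    total = start
    for item in iterable:
        # Equivalent to `total += item`: uses __iadd__ if defined,
        # otherwise falls back to __add__.
        total = operator.iadd(total, item)
    return total


# The builtin sum() uses +, so the start list is left alone.
start_builtin = []
result_builtin = sum([[1], [2]], start_builtin)

# The hypothetical variant mutates the start list and returns that same object.
start_iadd = []
result_iadd = iadd_sum([[1], [2]], start_iadd)

print(result_builtin, start_builtin)
print(result_iadd, start_iadd, result_iadd is start_iadd)
```

Both calls return an equal list, but only the second leaves its start argument changed, which is a difference a caller can observe.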

I think that it is a reasonable position to take that we should not care about "broken" classes that define __iadd__ differently to __add__. I'm not sure that I agree, but regardless, it is a reasonable position. But arguing that the proposed change from __add__ to __iadd__ is not a semantic change to sum is simply unreasonable.


>> Even making __iadd__ optional will potentially break working code. Python
>> doesn't *require* that __iadd__ perform the same operation as __add__. That
>> is the normal expectation, of course, but it's not enforced. (How could it
>> be?) We might agree that objects where __add__ and __iadd__ do different
>> things are "broken" in some sense, but you're allowed to write broken code,
>> and Python should (in principle) avoid making it even more broken by
>> changing behaviour unnecessarily. But maybe the right answer there is simply
>> "don't call sum if you don't want __iadd__ called".
>
> Python has previously had precedents where broken code does not get to
> dictate the language as long as that code was very rare. This is more
> than very rare. Additionally, Python does (unclearly, but it does do
> so) define __iadd__ to be an inplace version of __add__, so the code
> isn't just “broken” -- it's broken.

Not so. The docs for __iadd__ and other augmented assignment operators state:

"These methods are called to implement the augmented arithmetic assignments (+=, -=, *=, /=, //=, %=, **=, <<=, >>=, &=, ^=, |=). These methods should attempt to do the operation in-place (modifying self) and return the result (which could be, but does not have to be, self)."

So, according to the docs, "x += y" might modify x in place and return a different instance, or even a completely different value. It is normal, and expected, for "x += y" to be the same as "x = x + y", but not compulsory. Python will fall back on the usual __add__ if __iadd__ is not defined, but (say) if you define your own DSL where += has some distinct meaning, you are free to define it to do something completely different.
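
A small sketch of both behaviours described above. The classes Plain and Odd are hypothetical, made up for illustration: Plain defines only __add__, so += falls back to it, while Odd defines an __iadd__ that deliberately does something different from + and returns a new object rather than self.

```python
class Plain:
    """Defines only __add__; `+=` falls back to it and rebinds the name."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        return Plain(self.value + other.value)


class Odd:
    """Hypothetical class whose `+=` is deliberately different from `+`."""

    def __init__(self, items):
        self.items = list(items)

    def __add__(self, other):
        return Odd(self.items + other.items)      # appends

    def __iadd__(self, other):
        # Unusual but allowed: prepend, and return a new object instead of self.
        return Odd(other.items + self.items)


p = Plain(1)
p_alias = p
p += Plain(2)            # no __iadd__, so this is p = p + Plain(2)

a = Odd([1])
a_alias = a
added = a + Odd([2])     # uses __add__
a += Odd([2])            # uses __iadd__

print(p.value, p is p_alias, p_alias.value)
print(added.items, a.items, a is a_alias)
```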

You consider it broken if a class defines += differently to +. I consider it unusual, but permitted. I believe the docs support my interpretation.

http://docs.python.org/release/3.1/reference/datamodel.html#object.__iadd__

E.g. I have a DSL where = reassigns to a data structure, += appends to an existing one, and + is not defined at all. You can say "x += value" but not "x = x + value". It makes sense in context. As I said, I am prepared to consider that the right answer to this is "well don't call sum on your data structure then", but it is a change in behaviour, not just an optimization.
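
A rough sketch of that kind of structure, assuming a hypothetical Log class (not the actual DSL, whose details are not given here). It supports += but not +, so the builtin sum() rejects it today, whereas a reduction built on operator.iadd, standing in for an __iadd__-based sum, would accept it:

```python
import operator
from functools import reduce


class Log:
    """Hypothetical append-only structure: supports `+=` but not `+`."""

    def __init__(self):
        self.entries = []

    def __iadd__(self, entry):
        self.entries.append(entry)
        return self


log = Log()
log += "started"
log += "finished"

# `+` is not defined at all, so this raises TypeError.
try:
    log + "x"
    plus_supported = True
except TypeError:
    plus_supported = False

# The builtin sum() relies on `+`, so it fails the same way.
try:
    sum(["a", "b"], log)
    summable_today = True
except TypeError:
    summable_today = False

# An __iadd__-based reduction (a stand-in for the proposed behaviour) succeeds
# and appends to the existing object.
merged = reduce(operator.iadd, ["a", "b"], log)

print(plus_supported, summable_today, merged is log, log.entries)
```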



-- 
Steven

