[Python-Dev] PEP 590 discussion

Mark Shannon mark at hotpy.org
Sat Apr 27 05:47:46 EDT 2019


Hi Petr,

On 24/04/2019 11:24 pm, Petr Viktorin wrote:
> On 4/10/19 7:05 PM, Jeroen Demeyer wrote:
>> On 2019-04-10 18:25, Petr Viktorin wrote:
>>> Hello!
>>> I've had time for a more thorough reading of PEP 590 and the reference
>>> implementation. Thank you for the work!
>>
>> And thank you for the review!
>>
>>> I'd now describe the fundamental
>>> difference between PEP 580 and PEP 590 as:
>>> - PEP 580 tries to optimize all existing calling conventions
>>> - PEP 590 tries to optimize (and expose) the most general calling
>>> convention (i.e. fastcall)
>>
>> And PEP 580 has better performance overall, even for METH_FASTCALL. 
>> See this thread:
>> https://mail.python.org/pipermail/python-dev/2019-April/156954.html
>>
>> Since these PEPs are all about performance, I consider this a very 
>> relevant argument in favor of PEP 580.
> 
> All about performance as well as simplicity, correctness, testability, 
> teachability... And PEP 580 touches some introspection :)
> 
>>> PEP 580 also does a number of other things, as listed in PEP 579. But I
>>> think PEP 590 does not block future PEPs for the other items.
>>> On the other hand, PEP 580 has a much more mature implementation -- and
>>> that's where it picked up real-world complexity.
>> About complexity, please read what I wrote in
>> https://mail.python.org/pipermail/python-dev/2019-March/156853.html
>>
>> I claim that the complexity in the protocol of PEP 580 is a good 
>> thing, as it removes complexity from other places, in particular from 
>> the users of the protocol (better have a complex protocol that's 
>> simple to use, rather than a simple protocol that's complex to use).
> 
> I think we're talking past each other. I see now it as:
> 
> PEP 580 takes existing complexity and makes it available to all users, 
> in a simpler way. It makes existing code faster.
> 
> PEP 590 defines a new simple/fast protocol for its users, and instead of 
> making existing complexity faster and easier to use, it's left to be 
> deprecated/phased out (or kept in existing classes for backwards 
> compatibility). It makes it possible for future code to be faster/simpler.
> 
> I think things should be simple by default, but if people want some 
> extra performance, they can opt in to some extra complexity.
> 
> 
>> As a more concrete example of the simplicity that PEP 580 could bring, 
>> CPython currently has 2 classes for bound methods implemented in C:
>> - "builtin_function_or_method" for normal C methods
>> - "method-descriptor" for slot wrappers like __eq__ or __add__
>>
>> With PEP 590, these classes would need to stay separate to get maximal 
>> performance. With PEP 580, just one class for bound methods would be 
>> sufficient and there wouldn't be any performance loss. And this 
>> extends to custom third-party function/method classes, for example as 
>> implemented by Cython.
> 
> Yet, for backwards compatibility reasons, we can't merge the classes.
> Also, I think CPython and Cython are exactly the users that can trade 
> some extra complexity for better performance.
> 
>>> Jeroen's analysis from
>>> https://mail.python.org/pipermail/python-dev/2018-July/154238.html seems
>>> to miss a step at the top:
>>>
>>> a. CALL_FUNCTION* / CALL_METHOD opcode
>>>        calls
>>> b. _PyObject_FastCallKeywords()
>>>        which calls
>>> c. _PyCFunction_FastCallKeywords()
>>>        which calls
>>> d. _PyMethodDef_RawFastCallKeywords()
>>>        which calls
>>> e. the actual C function (*ml_meth)()
>>>
>>> I think it's more useful to say that both PEPs bridge a->e (via
>>> _Py_VectorCall or PyCCall_Call).
>>
>> Not quite. For a builtin_function_or_method, we have with PEP 580:
>>
>> a. call_function()
>>      calls
>> d. PyCCall_FastCall
>>      which calls
>> e. the actual C function
>>
>> and with PEP 590 it's more like:
>>
>> a. call_function()
>>      calls
>> c. _PyCFunction_FastCallKeywords
>>      which calls
>> d. _PyMethodDef_RawFastCallKeywords
>>      which calls
>> e. the actual C function
>>
>> Level c. above is the vectorcall wrapper, which is a level that PEP 
>> 580 doesn't have.
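
(For anyone following along, here is a rough sketch -- not the actual
CPython code -- of how the call site (level a) dispatches to the
callable's vectorcall function (level c above, for a
builtin_function_or_method) under a PEP 590-style protocol. The helper
name call_vectorcall_sketch is made up, and the flag/slot names follow
the PEP rather than any particular CPython revision.)

    static PyObject *
    call_vectorcall_sketch(PyObject *callable, PyObject *const *stack,
                           Py_ssize_t nargs, PyObject *kwnames)
    {
        PyTypeObject *tp = Py_TYPE(callable);
        /* Callables that opt in advertise a per-instance function
           pointer at a type-specified offset. */
        if (tp->tp_flags & Py_TPFLAGS_HAVE_VECTORCALL) {
            vectorcallfunc func = *(vectorcallfunc *)
                (((char *)callable) + tp->tp_vectorcall_offset);
            /* The arguments already sit contiguously on the interpreter
               stack, so they are passed through without copying. */
            return func(callable, stack, (size_t)nargs, kwnames);
        }
        /* Everything else falls back to the existing machinery. */
        return _PyObject_FastCallKeywords(callable, stack, nargs, kwnames);
    }
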
> 
> PEP 580 optimizes all the code paths, where PEP 590 optimizes the fast 
> path, and makes sure most/all use cases can use (or switch to) the fast 
> path. Both fast paths are fast: bridging a->e using zero-copy arg passing with
> some C calls and flag checks.
> 
> The PEP 580 approach is faster; PEP 590's is simpler.

Why do you say that PEP 580's approach is faster? There is no evidence 
for this.
The only evidence so far is a couple of contrived benchmarks. Jeroen's 
showed a ~1% speedup for PEP 580 and mine showed a ~30% speedup for PEP 
590.
This clearly shows that I am better at coming up with contrived 
benchmarks :)

PEP 590 was chosen as the fastest protocol I could come up with that was 
fully general, and wasn't so complex as to be unusable.
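
For reference, the whole protocol boils down to a single function
pointer type -- the "fastcall" convention mentioned at the top of the
thread (names as in the PEP; the exact spellings may still change
before it lands in CPython):

    /* args is a contiguous C array of positional arguments, followed by
       the values of any keyword arguments; nargsf is the number of
       positional arguments (possibly with PY_VECTORCALL_ARGUMENTS_OFFSET
       or-ed in); kwnames is an optional tuple of keyword names. */
    typedef PyObject *(*vectorcallfunc)(PyObject *callable,
                                        PyObject *const *args,
                                        size_t nargsf,
                                        PyObject *kwnames);
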

> 
> 
>>> Jeroen, is there something in PEPs 579/580 that PEP 590 blocks, or
>>> should address?
>>
>> Well, PEP 580 is an extensible protocol while PEP 590 is not. But, 
>> PyTypeObject is extensible, so even with PEP 590 one can always extend 
>> that (for example, PEP 590 uses a type flag 
>> Py_TPFLAGS_METHOD_DESCRIPTOR where PEP 580 instead uses the structs 
>> for the C call protocol). But I guess that extending PyTypeObject will 
>> be harder to justify (say, in a future PEP) than extending the C call 
>> protocol.
> 
> That's a good point.

Saying that PEP 590 is not extensible is true, but misleading.
PEP 590 is fully universal: it supports callables that can do anything 
with anything. There is no need for it to be extended, because it already 
supports any possible behaviour.
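
To make that concrete, here is a minimal sketch of a third-party
callable opting in (struct, flag and slot names follow the PEP and the
reference implementation; they are not necessarily the final spellings):

    #include <Python.h>
    #include <stddef.h>   /* offsetof */

    typedef struct {
        PyObject_HEAD
        vectorcallfunc vectorcall;  /* per-instance entry point, set to
                                       mycallable_vectorcall when the
                                       instance is created */
        /* ... whatever state the callable needs ... */
    } MyCallableObject;

    static PyObject *
    mycallable_vectorcall(PyObject *self, PyObject *const *args,
                          size_t nargsf, PyObject *kwnames)
    {
        /* The protocol places no constraints on what happens here. */
        Py_RETURN_NONE;
    }

    static PyTypeObject MyCallable_Type = {
        PyVarObject_HEAD_INIT(NULL, 0)
        .tp_name = "mycallable",
        .tp_basicsize = sizeof(MyCallableObject),
        .tp_vectorcall_offset = offsetof(MyCallableObject, vectorcall),
        .tp_flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_VECTORCALL,
        .tp_call = PyVectorcall_Call,  /* so tp_call users see the same
                                          behaviour */
    };
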

Cheers,
Mark.

