[Python-Dev] Comparing PEP 576 and PEP 580

Sat Jul 7 09:34:44 EDT 2018

On 7 July 2018 at 07:12, Guido van Rossum <guido at python.org> wrote:
> On Fri, Jul 6, 2018 at 2:52 AM Jeroen Demeyer <J.Demeyer at ugent.be> wrote:
>> The Cython developers (in particular Stefan Behnel) certainly support my
>> work. I have talked with them in person at a workshop and they posted a
>> few emails to python-dev and they also gave me some personal comments
>> about PEP 580.
>
>
> And how do they feel about PEP 576? I'd like to see some actual debate of
> the pros and cons of the details of PEP 576 vs. PEP 580. So far I mostly see
> you and INADA Naoki disagreeing about process, which doesn't give me the
> feeling that there is consensus.

I think part of the confusion here stems from the fact that Jeroen's
original PEP 575 tried to cover a lot of different things, with two of
the most notable being:

1. Getting functions implemented in C to act more like their Python
counterparts from an introspection and capability perspective (e.g.
having access to their defining module regardless of whether they're a
top level function or a method on a type definition)
2. Allowing third party compilers like Cython to route function calls
through the CPython C API more efficiently than the existing public
APIs that require building and deconstructing Python tuples and dicts

Hence the request that the PEP be split up into an overview PEP
describing the problem space (now available as
https://www.python.org/dev/peps/pep-0579/ ), and then follow-up PEPs
targeting specific sub-topics within that PEP.

That's happened for PEP 580 (since Jeroen was working on both PEPs at
the same time as an initial replacement for PEP 575), but PEP 576
hasn't been updated yet to specify which of the subtopics within PEP
579 it is aiming to address.

My current reading is that PEP 576 isn't really targeting the same
aspects as PEP 580: PEP 580 aims to allow third party callable
implementations to be as fast as native CPython internal ones
regardless of which callable type they use (point 2 above), while PEP
576 has the more modest aim of eliminating some of the current reasons
that third parties find it necessary to avoid using the CPython native
callable types in the first place (point 1 above).

That said, I think Inada-san's request for benchmarks that clearly
demonstrate the challenges with the status quo, and hence can be used
to quantify the potential benefits, is a sound one. Those same
benchmarks can then be used to assess the complexity of adapting
existing third party tools and libraries like Cython and NumPy to
implement the proposals and produce updated benchmarking numbers (for
PEP 580, add code to implement the new protocol method; for PEP 576,
switch to inheriting from one of the two newly defined C level
types).

At a micro-benchmark level, that would probably involve just comparing
the performance of mapping builtin functions and methods over a range
with that of variants of those functions implemented using only the
public C API
(the specific functions and methods chosen for the benchmark will need
to be those where the optimisations for particularly simple function
signatures don't apply).
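
As a rough illustration of the micro-benchmark shape described above,
here is a minimal sketch using only the standard library. The names
time_map and py_divmod are hypothetical, and py_divmod is just a pure
Python placeholder: a real comparison would substitute a variant built
as an extension module using only the public C API. Choosing divmod as
a two-argument builtin is likewise an assumption made so that the call
isn't a single-argument one.

```python
import timeit
from itertools import repeat


def time_map(func, n=10_000, repeats=5, number=10):
    """Return the best observed time in seconds for one pass of
    map(func, range(n), repeat(7)), consuming every result."""
    def run():
        # Two positional arguments per call, so the measurement is not
        # dominated by any special handling of one-argument calls.
        for _ in map(func, range(n), repeat(7)):
            pass
    # Take the minimum over several runs to reduce scheduling noise.
    return min(timeit.repeat(run, repeat=repeats, number=number)) / number


def py_divmod(a, b):
    # Placeholder only (assumption): stands in for a variant of the
    # builtin implemented with the public C API.
    return divmod(a, b)


baseline = time_map(divmod)
variant = time_map(py_divmod)
print(f"builtin: {baseline:.6f}s  variant: {variant:.6f}s  "
      f"ratio: {variant / baseline:.2f}x")
```

The ratio is the interesting number here: rerunning the same harness
before and after adapting the variant to one of the proposals would
give the kind of comparison being asked for.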

At a macro-benchmark level, it would likely require first choosing or
defining a language level computational performance benchmark that's
based on real world code using libraries like
NumPy/pandas/scikit-learn, akin to the domain specific benchmarks we
already have for libraries like SQLAlchemy, dulwich, and various
templating engines (django, mako, genshi).

One possibility would be to set up comparisons of
https://github.com/numpy/numpy/tree/master/benchmarks numbers between
a conventional NumPy and one with more optimised C level calls, and
to do something similar for
https://pandas.pydata.org/pandas-docs/stable/contributing.html#running-the-performance-test-suite.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia