[Python-Dev] Benchmarks why we need PEP 576/579/580

Jeroen Demeyer J.Demeyer at UGent.be
Mon Jul 23 07:13:02 EDT 2018


On 2018-07-23 01:54, Ivan Pozdeev via Python-Dev wrote:
> All the material to discuss that we have in this thread is a single test
> result that's impossible to reproduce and impossible to run in Py3.

I just posted that it can be reproduced on Python 3.7:
https://mail.python.org/pipermail/python-dev/2018-July/154740.html

I admit that it's not entirely trivial to do that. The Python 3 port of 
SageMath is still a work in progress, and the Python 3.7 port even more 
so, so reproducing the results requires a few patches. If somebody wants 
to reproduce them right now, I can give more details. But really, I would 
recommend waiting a month or so; by then those patches will hopefully 
have been merged.

> It's however impossible to say from this
> how frequent these scenarios are in practice

And how would you suggest that we measure that? All benchmarks are 
artificial in some way: for every benchmark, one can find reasons why 
it's not relevant.

> and how consistent the improvement is among them.

I only posted the most serious regression. As another data point, the 
total time to run the full SageMath testsuite increased by about 1.8% 
when the Cython code was compiled with binding=True. So one could assume 
an average improvement of about 1.8%, with much larger improvements in a 
few specific cases.
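
For context, "compiling with binding=True" refers to the Cython compiler 
directive that makes Cython-compiled functions behave like ordinary Python 
function objects, whose slower calling convention is exactly the overhead 
the PEP targets. A minimal sketch of enabling it project-wide (the setup.py 
layout and the module name "mymodule.pyx" are just illustrative, not the 
actual SageMath build):

    # setup.py -- minimal sketch, not the real SageMath build setup.
    # Passing binding=True as a compiler directive makes Cython emit
    # Python-compatible function objects instead of faster builtin-style
    # functions, which is the call overhead being benchmarked here.
    from setuptools import setup
    from Cython.Build import cythonize

    setup(
        ext_modules=cythonize(
            "mymodule.pyx",
            compiler_directives={"binding": True},
        )
    )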

> Likewise, it's impossible to say anything
> about the complexity the changes will reduce/introduce without a
> proof-of-concept implementation.

Why do you think that there is no implementation? As mentioned in PEP 
580, there is an implementation at
https://github.com/jdemeyer/cpython/tree/pep580


Jeroen.

