[Numpy-discussion] Looking for a difference between NumPy 1.19.5 and 1.20 explaining a perf regression with Pythran

Juan Nunez-Iglesias jni at fastmail.com
Sun Mar 14 01:15:39 EST 2021


Hi Pierre,

If you’re able to compile NumPy locally and you have reliable benchmarks, you can write a script that tests the runtime of your benchmark and reports it as a test pass/fail. You can then use “git bisect run” to automatically find the commit that caused the issue. That will help narrow down the discussion before it gets completely derailed a second time. 😂

https://lwn.net/Articles/317154/
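
A minimal sketch of such a pass/fail script (the workload and THRESHOLD below are placeholders, not the real benchmark; calibrate the threshold on a known-good commit, then run `git bisect run python bisect_bench.py`):

```python
import sys
import timeit

THRESHOLD = 1.0  # seconds; placeholder -- calibrate on a known-good commit

def benchmark():
    # Placeholder workload; replace with the real Pythran benchmark.
    return min(timeit.repeat("sum(range(10**5))", number=10, repeat=3))

def classify(runtime, threshold):
    # git bisect convention: exit 0 = good (fast), exit 1 = bad (slow)
    return 0 if runtime <= threshold else 1

if __name__ == "__main__":
    sys.exit(classify(benchmark(), THRESHOLD))
```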

Juan. 

> On 13 Mar 2021, at 10:34 am, PIERRE AUGIER <pierre.augier at univ-grenoble-alpes.fr> wrote:
> 
> Hi,
> 
> I tried to compile NumPy with `pip install numpy==1.20.1 --no-binary numpy --force-reinstall` and I can reproduce the regression.
> 
> Good news: I was able to reproduce the difference with NumPy 1.20.1 alone.
> 
> Arrays prepared with (`df` is a Pandas dataframe)
> 
> arr = df.values.copy()
> 
> or 
> 
> arr = np.ascontiguousarray(df.values)
> 
> lead to "slow" execution while arrays prepared with
> 
> arr = np.copy(df.values)
> 
> lead to faster execution.
> 
> arr.copy() and np.copy(arr) do not give the same result when arr is obtained from a Pandas dataframe with arr = df.values. This is strange because type(df.values) is <class 'numpy.ndarray'>, so I would expect arr.copy() and np.copy(arr) to behave identically.
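
[Editor's note: one documented difference between the two calls is the default copy order: ndarray.copy defaults to order='C', while np.copy defaults to order='K' (preserve the source layout). Whether this explains the regression is only a guess, but the difference is easy to see on a Fortran-ordered array, which df.values can be for a homogeneous multi-column frame:]

```python
import numpy as np

a = np.asfortranarray(np.arange(6.0).reshape(2, 3))  # F-ordered input

b = a.copy()    # ndarray.copy defaults to order='C': forces C layout
c = np.copy(a)  # np.copy defaults to order='K': keeps the F layout

print(b.flags['C_CONTIGUOUS'], c.flags['F_CONTIGUOUS'])  # True True
```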
> 
> Note that I believe my benchmarks are careful and reproducible. I also checked that this regression is reproducible on another computer.
> 
> Cheers,
> 
> Pierre
> 
> ----- Original Message -----
>> From: "Sebastian Berg" <sebastian at sipsolutions.net>
>> To: "numpy-discussion" <numpy-discussion at python.org>
>> Sent: Friday, 12 March 2021 22:50:24
>> Subject: Re: [Numpy-discussion] Looking for a difference between NumPy 1.19.5 and 1.20 explaining a perf regression with
>> Pythran
> 
>>> On Fri, 2021-03-12 at 21:36 +0100, PIERRE AUGIER wrote:
>>> Hi,
>>> 
>>> I'm looking for a difference between NumPy 1.19.5 and 1.20 which
>>> could explain a performance regression (~15 %) with Pythran.
>>> 
>>> I observe this regression with the script
>>> https://github.com/paugier/nbabel/blob/master/py/bench.py
>>> 
>>> Pythran reimplements NumPy, so this is not about NumPy's own
>>> computation code. However, Pythran of course uses the native array
>>> contained in a NumPy array. I'm quite sure that something changed
>>> between NumPy 1.19.5 and 1.20 (or between the corresponding wheels?),
>>> since I don't get the same performance with NumPy 1.20. I checked
>>> that the values in the arrays are the same and that the flags
>>> characterizing the arrays are also the same.
>>> 
>>> Good news: I'm now able to obtain the performance difference just
>>> with NumPy 1.19.5. In this code, I load the data with Pandas and need
>>> to prepare contiguous NumPy arrays to give to Pythran. With
>>> NumPy 1.19.5, if I use np.copy I get better performance than with
>>> np.ascontiguousarray. With NumPy 1.20, both functions create arrays
>>> giving the same performance with Pythran (again, worse than with
>>> NumPy 1.19.5).
>>> 
>>> Note that this code is very efficient (more than 100 times faster
>>> than using NumPy), so I guess that things like alignment or memory
>>> location can lead to such a difference.
>>> 
>>> More details in this issue
>>> https://github.com/serge-sans-paille/pythran/issues/1735
>>> 
>>> Any help to understand what has changed would be greatly appreciated!
>>> 
>> 
>> If you want to really dig into this, it would be good to do profiling
>> to find out at where the differences are.
>> 
>> Without that, I don't have much appetite to investigate personally. The
>> reason is that fluctuations of ~30% (or even much more) when running
>> the NumPy benchmarks are very common.
>> 
>> I am not aware of an immediate change in NumPy, especially since you
>> are talking about Pythran, where only the memory layout or the
>> interface code should matter.
>> As to the interface code... I would expect it to be quite a bit
>> faster, not slower.
>> There was no change around data allocation, so at best what you are
>> seeing is a different pattern in how the "small array cache" ends up
>> being used.
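
[Editor's note: if memory placement is the suspect, a quick diagnostic sketch is to compare the data-pointer alignment produced by each preparation path; 64 bytes is just one plausible SIMD boundary to check against, not a claim about what NumPy guarantees:]

```python
import numpy as np

def alignment(arr, boundary=64):
    # Offset of the array's data pointer from a `boundary`-byte boundary.
    return arr.ctypes.data % boundary

a = np.empty((1000, 3))
for name, prepared in [("arr.copy()", a.copy()),
                       ("np.copy(arr)", np.copy(a)),
                       ("ascontiguousarray", np.ascontiguousarray(a))]:
    print(name, alignment(prepared))
```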
>> 
>> 
>> Unfortunately, getting stable benchmarks that reflect code changes
>> exactly is tough... Here is a nice blog post from Victor Stinner where
>> he had to go as far as using profile-guided compilation to avoid
>> fluctuations:
>> 
>> https://vstinner.github.io/journey-to-stable-benchmark-deadcode.html
>> 
>> I somewhat hope that this also explains the huge fluctuations we see
>> in the NumPy benchmarks after absolutely unrelated code changes.
>> But I did not have the energy to try it (and a probably-now-fixed bug
>> in gcc makes it a bit harder right now).
>> 
>> Cheers,
>> 
>> Sebastian
>> 
>> 
>> 
>> 
>>> Cheers,
>>> Pierre
>>> _______________________________________________
>>> NumPy-Discussion mailing list
>>> NumPy-Discussion at python.org
>>> https://mail.python.org/mailman/listinfo/numpy-discussion
>>> 
>> 
>> 

