[issue32534] Speed-up list.insert: use memmove()

Jeethu Rao report at bugs.python.org
Tue Jan 16 19:07:53 EST 2018


Jeethu Rao <jeethu at jeethurao.com> added the comment:

Victor: I’m booting with the isolcpus and rcu_nocbs flags, and running pyperformance with the --affinity flag to pin the benchmark to the isolated CPU cores. I’ve also run `perf system tune`. The OS is Ubuntu 17.10. Thanks for the tip on using perf timeit instead of timeit. I’ve run the benchmark you suggested with a minor change (to avoid the cost of LOAD_ATTR) and posted the output in a gist[1].

Antoine: Thanks for benchmarking it. After looking at the generated assembly[2], I found that ins1 was being inlined and that the call to memmove appeared before the loop (possibly because the compiler assumes that the memmove path is the more likely one). I made a minor change and increased the threshold to 32. I’ve posted the newly generated assembly in a gist[3] (the relevant sequence is around line 8406, if you’re interested). And here’s the pyperformance comparison[4]. Could you please try benchmarking this version on your machine?
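
For anyone following along without the patch open, the general shape of the idea is sketched below. This is not the code from the patch: the names `SHIFT_LOOP_THRESHOLD`, `shift_right_by_one` and `insert_at` are made up for illustration, the array holds plain `void *` rather than `PyObject *`, and I'm assuming here that the loop is used below the threshold and memmove at or above it. The exact comparison in the real change may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name; the message above only states that the threshold is 32. */
#define SHIFT_LOOP_THRESHOLD 32

/*
 * Open a gap at index `where` in an array currently holding `n` pointers
 * by moving items[where .. n-1] one slot to the right.
 * The caller must guarantee that the array has room for n + 1 elements.
 */
static void
shift_right_by_one(void **items, size_t n, size_t where)
{
    assert(where <= n);
    size_t count = n - where;   /* number of elements that have to move */

    if (count < SHIFT_LOOP_THRESHOLD) {
        /* Short tail: a simple backwards copy loop avoids a function call. */
        for (size_t i = n; i > where; i--) {
            items[i] = items[i - 1];
        }
    }
    else {
        /* Long tail: memmove is defined for overlapping source and target. */
        memmove(&items[where + 1], &items[where], count * sizeof(void *));
    }
}

/* Insert `value` at index `where`, shifting the tail right by one slot. */
static void
insert_at(void **items, size_t n, size_t where, void *value)
{
    shift_right_by_one(items, n, where);
    items[where] = value;
}
```

Both branches leave the array in the same state; the threshold only decides which copying strategy is used, so where the cutoff should sit is a question for benchmarks on the target machine.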

[1]: https://gist.github.com/jeethu/2d2de55afdb8ea4ad03b6a5d04d5227f
[2]: Generated with “gcc -DNDEBUG -fwrapv -O3 -std=c99 -I. -I./Include -DPy_BUILD_CORE -S -masm=intel Objects/listobject.c”
[3]: https://gist.github.com/jeethu/596bfc1251590bc51cc230046b52fb38
[4]: https://gist.github.com/jeethu/d6e4045f7932136548a806380eddd030

----------

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue32534>
_______________________________________

