[issue25823] Speed-up oparg decoding on little-endian machines
Armin Rigo
report at bugs.python.org
Sat Dec 12 04:53:21 EST 2015
Armin Rigo added the comment:
Fwiw, I made a trivial benchmark in C that loads aligned and misaligned shorts ( http://paste.pound-python.org/show/HwnbCI3Pqsj8bx25Yfwp/ ). It shows that the memcpy() version takes only 65% of the time taken by the two-bytes-loaded version on a 2010 laptop. It takes 75% of the time on a modern server. On a recent little-endian PowerPC machine, it takes 96%. On aarch64, it takes only 45% of the time (i.e. it is more than twice as fast). This is all with gcc. It seems that using memcpy() is definitely a win nowadays.
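For readers without access to the paste, here is a minimal sketch of the two kinds of 16-bit load being compared. It is not the benchmark itself, and the function names load_u16_bytes and load_u16_memcpy are made up for illustration; they are not taken from the paste or from CPython.

```c
#include <stdint.h>
#include <string.h>

/* "Two-bytes-loaded" version: read each byte separately and combine them
 * as a little-endian value. Works at any alignment and gives the same
 * result regardless of host byte order. (Hypothetical name.) */
static inline uint16_t load_u16_bytes(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* memcpy() version: copy two bytes into a local variable. This is safe for
 * misaligned pointers, and compilers such as gcc commonly reduce it to a
 * single 16-bit load. The result is in host byte order, so it only matches
 * load_u16_bytes() on a little-endian machine. (Hypothetical name.) */
static inline uint16_t load_u16_memcpy(const unsigned char *p)
{
    uint16_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

Both functions accept a pointer at any alignment, so the same code path covers the aligned and misaligned cases; whether the memcpy() form actually compiles to one load instruction depends on the compiler and optimization level.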
----------
nosy: +arigo
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue25823>
_______________________________________