[issue17628] str==str: compare the first and last character before calling memcmp()

Thu Apr 4 10:33:11 CEST 2013

STINNER Victor added the comment:

> In other words, I'm not convinced this is a useful heuristic.

Me neither, but we should use the same optimization strategy for all
functions. If we don't compare the first and/or last character for
str==str, we should not do it for bytes==bytes and Py_UNICODE_MATCH
either.
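
For illustration, the heuristic under discussion can be sketched roughly
as below. This is only a minimal sketch, not the actual patch:
equal_with_precheck() and equal_plain() are hypothetical names, and the
buffers are assumed to be plain bytes of the same known length.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not CPython's actual code: returns 1 if the two
   buffers of equal length `len` hold the same bytes, 0 otherwise. */
static int
equal_with_precheck(const char *a, const char *b, size_t len)
{
    if (len == 0)
        return 1;
    /* Cheap rejection: a differing first or last byte means not equal. */
    if (a[0] != b[0] || a[len - 1] != b[len - 1])
        return 0;
    /* Otherwise fall back to a full comparison. */
    return memcmp(a, b, len) == 0;
}

/* Baseline without the heuristic, for comparison. */
static int
equal_plain(const char *a, const char *b, size_t len)
{
    return memcmp(a, b, len) == 0;
}
```

Both functions return the same result for every input; the only
question is whether the two extra reads make the common case faster or
slower.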

str==str performance depends on the compiler and the libc, so
performance may be very different on Windows. I will try to run the
benchmark on Windows.

GCC also has a known performance issue with memcmp. Its builtin memcmp
implementation is slower than the one in glibc >= 2.10, especially
glibc >= 2.13.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43052
"GCC can't beat glibc if function call overhead is low."

2013/4/4 Antoine Pitrou <report at bugs.python.org>:
>
> Antoine Pitrou added the comment:
>
>> I don't understand why the patch makes the comparison much slower,
>> since most time is supposed to be spent in memcmp()?
>
> Because reading the last character evicts useful data from the CPU cache, just before memcmp() reads it again from memory?
>
> In other words, I'm not convinced this is a useful heuristic.
>
> ----------
>
> _______________________________________
> Python tracker <report at bugs.python.org>
> <http://bugs.python.org/issue17628>
> _______________________________________
