[issue17628] str==str: compare the first and last character before calling memcmp()

Marc-Andre Lemburg report at bugs.python.org
Thu Apr 4 19:30:09 CEST 2013


Marc-Andre Lemburg added the comment:

On 04.04.2013 19:00, Eric Snow wrote:
> 
> Eric Snow added the comment:
> 
>> Marc-Andre Lemburg added the comment:
>> Same here. The heuristic may work for short strings that easily fit
>> into the CPU cache, but as soon as you use it on longer strings,
>> this will result in much slower comparisons.
> 
> When testing both, would it help to test the end of the string before the beginning?  I'd expect that be more likely to leave the beginning in the cache for any subsequent memcmp() call.

Again: this depends a lot on what strings you are dealing with. If
you are comparing strings that only vary in the first few characters,
testing the last character first would not be ideal :-)

Given that CPUs are optimized to read ahead in memory, it's always
better to avoid jumping around too much when accessing memory.

http://en.wikipedia.org/wiki/CPU_cache
http://en.wikipedia.org/wiki/Locality_of_reference
http://lwn.net/Articles/252125/

Ideally, you want to stay within a cache line, typically 64 bytes.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue17628>
_______________________________________


More information about the Python-bugs-list mailing list