[Python-Dev] Why not using the hash when comparing strings?
Duncan Booth
duncan.booth at suttoncourtenay.org.uk
Fri Oct 19 12:02:19 CEST 2012
Hrvoje Niksic <hrvoje.niksic at avl.com> wrote:
> On 10/19/2012 03:22 AM, Benjamin Peterson wrote:
>> It would be interesting to see how common it is for strings which have
>> their hash computed to be compared.
>
> Since all identifier-like strings mentioned in Python are interned, and
> therefore have had their hash computed, I would imagine comparing them
> to be fairly common. After all, strings are often used as makeshift
> enums in Python.
>
> On the flip side, those strings are typically small, so a measurable
> overall speed improvement brought by such a change seems unlikely.
I'm pretty sure it would result in a small slowdown.

Many (most?) of the comparisons against interned identifiers will be done
by dictionary lookups, and the dictionary lookup code only tries the string
comparison after it has determined that the hashes match. The only time
the contents of dictionary key strings are actually compared is when the
hashes match but the pointers are different; if the hashes don't match,
the strings are never compared.
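
To make the order of checks concrete, here is a minimal sketch in Python.
It is a simplified model of the per-slot test described above, not the
actual CPython code; the names CountingStr and key_matches are invented
for illustration only.

```python
class CountingStr(str):
    """Hypothetical str subclass that counts how often == is invoked."""
    eq_calls = 0

    def __eq__(self, other):
        CountingStr.eq_calls += 1
        return str.__eq__(self, other)

    # Defining __eq__ disables the inherited hash, so restore it explicitly.
    __hash__ = str.__hash__


def key_matches(stored_key, stored_hash, probe, probe_hash):
    """Simplified model of the check a dictionary lookup does per slot."""
    if stored_key is probe:        # same object (pointer): no comparison
        return True
    if stored_hash != probe_hash:  # hashes differ: contents never compared
        return False
    return stored_key == probe     # hashes match, pointers differ


a = CountingStr("spam")
b = CountingStr("spam")            # equal contents, but a distinct object

key_matches(a, hash(a), a, hash(a))  # identity shortcut, no == call
key_matches(a, hash(a), b, hash(b))  # falls through to one == call
print(CountingStr.eq_calls)
```

Only the second call reaches the contents comparison, so an extra hash
check inside the string comparison itself would have nothing left to
skip on this path.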