hash()
Raymond Hettinger
python at rcn.com
Mon Dec 5 14:38:04 EST 2005
[John Marshall]
> For strings of > 1 character, what are the chances
> that hash(st) and hash(st[::-1]) would return the
> same value?
Python's string hash algorithm is non-commutative, so a collision with
a reversed string is not likely. The exact answer depends on the
population of strings being hashed, but it's not hard to compute
collision statistics for a sampling of those strings:
collisions = len(sample) - len(set(hash(s) for s in sample))
FWIW, here is how Python computes string hash values:
static long
string_hash(PyStringObject *a)
{
register int len;
register unsigned char *p;
register long x;
len = a->ob_size;
p = (unsigned char *) a->ob_sval;
x = *p << 7;
while (--len >= 0)
x = (1000003*x) ^ *p++;
x ^= a->ob_size;
if (x == -1)
x = -2;
return x;
}
> My goal is to uniquely identify multicharacter strings,
> all of which begin with "/" and never end with "/".
> Therefore, st != st[::-1].
Just use a set -- no string reversal is needed for detection of unique
multicharacter strings..
Raymond
More information about the Python-list
mailing list