[Python-Dev] http://mail.python.org/pipermail/python-dev/2011-December/115172.html
Jim Jewett
jimjjewett at gmail.com
Mon Jan 2 01:37:26 CET 2012
In http://mail.python.org/pipermail/python-dev/2011-December/115172.html,
P. J. Eby wrote:
> On Sat, Dec 31, 2011 at 7:03 AM, Stephen J. Turnbull <stephen at xemacs.org> wrote:
>> While the dictionary probe has to start with a hash for backward
>> compatibility reasons, is there a reason the overflow strategy for
>> insertion has to be buckets containing lists? How about
>> double-hashing, etc?
> This won't help, because the keys still have the same hash value. ANYTHING
> you do to them after they're generated will result in them still colliding.
> The *only* thing that works is to change the hash function in such a way
> that the strings end up with different hashes in the first place.
> Otherwise, you'll still end up with (deliberate) collisions.
Well, there is nothing wrong with switching to a different hash function after N
collisions, rather than "in the first place". The perturbation
effectively does by
shoving the high-order bits through the part of the hash that survives the mask.
> (Well, technically, you could use trees or some other O log n data
> structure as a fallback once you have too many collisions, for some value
> of "too many". Seems a bit wasteful for the purpose, though.)
Your WSGI specification < http://www.python.org/dev/peps/pep-0333/ > requires
using a real dictionary for compatibility; storing some of the values
outside the
values array would violate that. Do you consider that obsolete?
-jJ
More information about the Python-Dev
mailing list