[Python-Dev] http://mail.python.org/pipermail/python-dev/2011-December/115172.html

PJ Eby pje at telecommunity.com
Mon Jan 2 04:00:33 CET 2012


On Sun, Jan 1, 2012 at 7:37 PM, Jim Jewett <jimjjewett at gmail.com> wrote:

> Well, there is nothing wrong with switching to a different hash function
> after N
> collisions, rather than "in the first place".  The perturbation
> effectively does by
> shoving the high-order bits through the part of the hash that survives the
> mask.
>

Since these are true hash collisions, they will all have the same high
order bits.  So, the usefulness of the perturbation is limited mainly to
the common case where true collisions are rare.


> (Well, technically, you could use trees or some other O log n data
> > structure as a fallback once you have too many collisions, for some value
> > of "too many".  Seems a bit wasteful for the purpose, though.)
>
> Your WSGI specification < http://www.python.org/dev/peps/pep-0333/ >
> requires
> using a real dictionary for compatibility; storing some of the values
> outside the
> values array would violate that.


When I said "use some other data structure", I was referring to the
internal implementation of the dict type, not to user code.  The only
user-visible difference (even at C API level) would be the order of keys()
et al.  (In any case, I still assume this is too costly an implementation
change compared to changing the hash function or seeding it.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-dev/attachments/20120101/83f246dc/attachment.html>


More information about the Python-Dev mailing list