hashing mutable instances (was Re: delete duplicates in list)

Michael Hudson mwh at python.net
Thu Oct 30 06:55:23 EST 2003


Thomas Heller <theller at python.net> writes:

> Michael Hudson <mwh at python.net> writes:
> 
> > Alex Martelli <aleax at aleax.it> writes:
> >
> >> > Actually the Set in sets module has an average lookup of O(1), worst
> >> > case O(n) (not 100% sure of worst case, but 99% sure). It's been
> >> 
> >> Hmmm -- could you give an example of that worstcase...?  a _full_
> >> hashtable would give such behavior, but Python's dicts always ensure
> >> the underlying hashtables aren't too full...
> >
> > Try inserting a bunch of instances of 
> >
> > class C:
> >     def __hash__(self): return 0
> >
> > into a dictionary.
> 
> I've though about using something like this in production code
> to be able to store mutable instances in a dict.

Eek!

> Performance problems aside (since there are only a couple of key/value
> pairs in the dict), is it such a bad idea?

Um, I don't know actually.  It pretty much defeats the point of using
a hashtable.  If your class has some non-mutable parts it'd be an
idea to hash them, of course.

And the performance problems are severe -- inserting just a thousand
instances of the class C() into a dictionary takes noticable time.

(What *is* a bad idea is giving such classes an __eq__ method that
mutates the dict being inserted into -- I found a bunch of such bugs a
couple years back and I think it's still possible to core Python (via
stack overflow) by being even more devious).

Cheers,
mwh

-- 
  Our Constitution never promised us a good or efficient government,
  just a representative one. And that's what we got.
      -- http://www.advogato.org/person/mrorganic/diary.html?start=109




More information about the Python-list mailing list