Why is dictionary.keys() a list and not a set?

Christoph Zwerschke cito at online.de
Sat Nov 26 05:49:50 EST 2005


Mike Meyer wrote:

 > I think you're trying to tweak the wrong definition. Types are
 > immutable if their instances are immutable.

I'm not trying to tweak it, but to gain more clarity about what is 
actually meant when you talk about "mutable" types and objects and 
whether it suffices to use these terms as long as you speak about 
built-in datatypes.

Tuples have been considered an immutable type since generations, becaue 
the elements of a tuple cannot be changed (i.e. the elements themselves, 
not their values). Likewise, they should be considered a hashable type 
because they have a proper hash function that can be used if all 
elements are hashable. Still, you can create instances of this 
immutable/hashable data type which are mutable/not hashable. So in the 
docs it will be important to be clear and emphasize that the *objects* 
have to be immutable/hashable, not only their types.

 > So whether or not a tuple containing a mutable object is
 > immutable depends on whether you define a immutable as "the object
 > can't change" or "the objects value can't change".

Right - if you have an actual tuple instance like (a,) and a is a list, 
you can argue whether it should be considered mutable or not. But you 
cannot argue whether it is hashable or not. It's simply not hashable. In 
so far, speaking about whether an object is hashable is more precise 
(well-defined) than asking whether it is mutable, you're right.

 > Do you think it's really necessary to specify "has a __hash__ method
 > that returns a value", as opposed to just "has a __hash__ method"?

I think it is necessary to require that it has a "proper" __hash__ 
function. As far as I understand you can always add a __hash__ method. 
Not only invalid __hash__ methods like in this case:

class mylist0(list):
     def __hash__(self): return None

But you can also easily add valid __hash__ methods:

class mylist1(list):
     def __hash__(self): return 0815

class mylist2(list):
     def __hash__(self): return id(self)

In the case of mylist1, everything is ok including semantics, but 
performance suffers dramatically. In mylist2, performance is great, but 
semantics suffers greatly.

Which of these user-defined types would you call "hashable"? Technically 
speaking, both probably are, so you can indeed use list objects of both 
types as keys of a dictionary - but I think there should be a hint in 
the docs that you need to have a "proper" hash function if you want your 
dictionary to be performant and show the usually expected behavior.

-- Christoph



More information about the Python-list mailing list