Dictionary .keys() and .values() should return a set [with Python 3000 in mind]

Sun Jul 2 02:38:47 EDT 2006

<vatamane at gmail.com> wrote in message
news:1151784117.119515.310100 at 75g2000cwc.googlegroups.com...
> This has been bothering me for a while. Just want to find out if it is
> just me or perhaps others have thought of this too: Why shouldn't the
> keyset of a dictionary be represented as a set instead of a list?

I think this is an interesting suggestion.  Of course, the current situation
is as much a product of historical progression as anything: lists and dicts
pre-date sets, so the collection of keys had to be returned as a list.
Since lists also carry some concept of order to them, the documentation for
the list returned by dict.keys() went out of its way to say that the order
of the elements in the dict.keys() list had no bearing on the dict, the
insertion order of entries, or anything else: the order of keys was
purely arbitrary.

In fact, there is not a little irony to this proposal, since it seems it was
just a few months ago that c.l.py had just about weekly requests for how to
create an "ordered dict," with various ideas of how a dict should be
ordered, but most intended to preserve the order of insertion of items into
the dict.  And now here we have just about the opposite proposal - dicts
should not only *not* be ordered, they should revel in their disorderliness.

I liked the example in which the OP (of this thread) wanted to compare the
keys from two different dicts, for equality of keys.  Since the keys()
method returns a list in indeterminate order, we can't simply perform
dictA.keys() == dictB.keys().  But how often does this really happen?  In
practice, I think the keys of a dict, when this collection is used at all,
are usually sorted and then iterated over, usually to prettyprint the keys
and values in the dict.  Internally, this collection of keys shouldn't even
exist as a separate data structure - the dict's keys are merely labels on
entries in some sort of hash table.
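
To make the two uses above concrete, here is a rough sketch, assuming
keys() hands back a list in arbitrary order as described.  The helper names
same_keys and sorted_items are made up for illustration; they are not part
of any library.

```python
def same_keys(dict_a, dict_b):
    """Return True if both dicts have exactly the same keys.

    Converting each key collection to a set makes the comparison
    independent of whatever order keys() happens to produce.
    """
    return set(dict_a.keys()) == set(dict_b.keys())


def sorted_items(d):
    """Return (key, value) pairs ordered by key, e.g. for prettyprinting."""
    return [(key, d[key]) for key in sorted(d.keys())]


dict_a = {"spam": 1, "eggs": 2, "ham": 3}
dict_b = {"ham": 30, "spam": 10, "eggs": 20}   # same keys, other values
dict_c = {"spam": 1, "eggs": 2}                # one key missing

print(same_keys(dict_a, dict_b))
print(same_keys(dict_a, dict_c))

# The more common case: sort the keys, then walk them in order.
for key, value in sorted_items(dict_a):
    print("%s: %s" % (key, value))
```

Comparing sorted(dictA.keys()) == sorted(dictB.keys()) would also work, but
only when the keys can be ordered against each other; the set version needs
nothing beyond hashability, which dict keys already have.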

Now that we really do have sets in Python, if one were to design dicts all
over again, it does seem like set would be a better choice than list for the
type returned by the keys() method.  To preserve compatibility, would a
second method, keyset(), do the trick?  The OP used this term himself,
referring to "the keyset of the dictionary".  Still, given the few cases
where I would access this as a true set, using "set(dictA.keys())" doesn't
seem to be that big a hardship, and its obviousness will probably undercut
any push for a separate keyset() method.
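
For the cases where a true set is wanted, a sketch along these lines shows
what set(dictA.keys()) buys you.  The function name compare_keys and the
sample dicts are hypothetical, chosen only for this example.

```python
def compare_keys(dict_a, dict_b):
    """Classify the keys of two dicts using ordinary set operations."""
    keys_a = set(dict_a.keys())
    keys_b = set(dict_b.keys())
    return {
        "common": keys_a & keys_b,   # keys present in both dicts
        "only_a": keys_a - keys_b,   # keys present only in dict_a
        "only_b": keys_b - keys_a,   # keys present only in dict_b
    }


old = {"host": "a", "port": 80, "debug": True}
new = {"host": "b", "port": 8080, "timeout": 30}

result = compare_keys(old, new)
print(sorted(result["common"]))
print(sorted(result["only_a"]))
print(sorted(result["only_b"]))
```

The one-time conversion costs a pass over the keys, which seems a fair
price given how rarely this comes up.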

-- Paul