Why is dictionary.keys() a list and not a set?

Thu Nov 24 13:05:20 EST 2005

Christoph Zwerschke <cito at online.de> wrote:

> Mike Meyer wrote:
> > Are you sure dict.values() should be a set? After all, values aren't
> > guaranteed to be unique, so dict(a = 1, b = 1).values() currently
> > returns [1, 1], but would return set([1]) under your proposal.
> 
> Good point. Contrary to the keys, the values are not unique. Still, it
> would make sense to only return the set (the value set of the mapping)
> because this is usually what you are interested in and you'd think the

Absolutely not in my use cases.  The typical reason I'm asking for
.values() is because each value is a count (e.g. of how many times the
key has occurred) and I want the total, so the typical idiom is
sum(d.values()) -- getting a result set would be an utter disaster, and
should I ever want it, set(d.values()) is absolutely trivial to code.
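
A minimal sketch of the difference, assuming a hypothetical dict of
counts (the name "counts" and its data are made up for illustration):

```python
# Hypothetical data: how many times each key has occurred.
counts = {"a": 1, "b": 1, "c": 3}

# The usual idiom: every count contributes to the total.
total = sum(counts.values())

# What a set-returning .values() would amount to: the two equal
# counts collapse into one before summing, which loses an occurrence.
distinct_total = sum(set(counts.values()))

print(total)           # 5
print(distinct_total)  # 4
```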

Note that in both cases I don't need a LIST, just an ITERATOR, so
itervalues would do just as well (and probably faster) except it looks
more cluttered and by a tiny margin less readable -- "the sum of the
values" rolls off the tongue, "the sum of the itervalues" doesn't!-)
So, the direction of change for Python 3000, when backwards
compatibility can be broken, is to return iterators rather than lists.
At that time, should you ever want a list, you'll say so explicitly, as
you do now when you want a set, i.e. list(d.values())
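
As a rough sketch (with made-up data), code written this way should
not care whether values() hands back a list or something lazier:

```python
d = {"x": 10, "y": 20}  # hypothetical data

# Summing only needs to iterate over the values once.
total = sum(d.values())

# Ask for a list explicitly when list behaviour is really needed,
# e.g. to append to it or index into it.
values_list = list(d.values())
values_list.append(30)

print(total)                # 30
print(sorted(values_list))  # [10, 20, 30]
```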

> >>For instance, by allowing the set operator "in" for dictionaries,
> >>instead of "has_key".
> > "in" already works for dictionaries:
> 
> I know. I wanted to use it as an example where set operations have 
> already been made available for dictionaries.
> 
> I know, 'in' has existed for lists and tuples long before sets, but 
> actually it is a native set operation.

Historically, the 'in' operator was added to dictionaries (and the
special method __contains__ introduced, to let you support containment
checking in your own types) well before sets were added to Python (first
as a standard library module, only later as a built-in).
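
For illustration only, here is a sketch of a type that supports 'in'
through __contains__; the class Evens is hypothetical and not from any
library:

```python
class Evens:
    """Hypothetical container that 'holds' every even integer."""

    def __contains__(self, item):
        # Called by the 'in' operator: item in evens
        return isinstance(item, int) and item % 2 == 0


evens = Evens()
print(4 in evens)         # True
print(7 in evens)         # False

# Dictionaries answer 'in' by looking at their keys.
print("a" in {"a": 1})    # True
print(1 in {"a": 1})      # False: 1 is a value, not a key
```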

In Python today 'in' doesn't necessarily mean set membership, but some
fuzzier notion of "presence in container"; e.g., you can code

'zap' in 'bazapper'

and get the result True (while 'paz' in 'bazapper' would be False, even
though, if you thought of the strings as SETS rather than SEQUENCES of
characters, that would be absurd).  So far, strings (plain and unicode)
are the only containers that read 'in' this way (presence of subsequence
rather than of single item), but one example should be enough to show
that "set membership" isn't exactly what the 'in' operator means.
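
A small sketch contrasting the three readings of 'in' on the same
characters (the variable names are just for illustration):

```python
word = "bazapper"

# Strings: 'in' tests for a contiguous subsequence.
print("zap" in word)             # True
print("paz" in word)             # False

# Treated as SETS of characters, every letter of 'paz' is present.
print(set("paz") <= set(word))   # True

# A list of the same characters tests single items only.
print("zap" in list(word))       # False
print("z" in list(word))         # True
```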

>  From a mathematical point of view, it would have been nice if Python
> had defined "set" as a basic data type in the beginning, with lists, 
> tuples, dictionaries as derived data types.

You appear to have a strange notion of "derived data type".  In what
sense, for example, would a list BE-A set?  It breaks all kinds of class
invariants, e.g. "number of items is identical to number of distinct
items", commutativity of addition, etc.
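
A quick sketch of those two invariants failing for lists, using
made-up values and taking set union (|) as the set counterpart of
"addition":

```python
items = [1, 1, 2]

# Sets: number of items equals number of distinct items. Lists: no.
print(len(items) == len(set(items)))          # False (3 vs 2)

a, b = [1, 2], [3]

# List concatenation depends on order...
print(a + b == b + a)                         # False

# ...while set union does not.
print((set(a) | set(b)) == (set(b) | set(a))) # True
```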

Alex