Python 3: dict & dict.keys()

Thu Jul 25 05:44:43 EDT 2013

On Thu, 25 Jul 2013 18:15:22 +1000, Chris Angelico wrote:

> On Thu, Jul 25, 2013 at 5:27 PM, Steven D'Aprano
> <steve+comp.lang.python at pearwood.info> wrote:
>> On Thu, 25 Jul 2013 16:02:42 +1000, Chris Angelico wrote:
>>
>>> On Thu, Jul 25, 2013 at 3:48 PM, Steven D'Aprano
>>> <steve+comp.lang.python at pearwood.info> wrote:
>>>> Dicts aren't sets, and don't support set methods:
>>>>
>>>> py> d1 - d2
>>>> Traceback (most recent call last):
>>>>   File "<stdin>", line 1, in <module>
>>>> TypeError: unsupported operand type(s) for -: 'dict' and 'dict'
>>>
>>> I wouldn't take this as particularly significant, though. A future
>>> version of Python could add that support (and it might well be very
>>> useful), without breaking any of the effects of views.
>>
>> I don't think dicts can ever support set methods, since *they aren't
>> sets*. Every element consists of both a key and a value, so you have to
>> consider both. Set methods are defined in terms of singleton elements,
>> not binary elements, so before you even begin, you have to decide what
>> it means when two elements differ in only one of the two parts.
>>
>> Given dicts {1: 'a'}, {1: 'b'}, what is the union of them? I can see
>> five possibilities:
>>
>> {1: 'a'}
>> {1: 'b'}
>> {1: ['a', 'b']}
>> {1: set(['a', 'b'])}
>> Error
>>
>> Each of the five results may be what you want in some circumstances. It
>> would be a stupid thing for dict.union to pick one behaviour and make
>> it the One True Way to perform union on two dicts.
> 
> That's true, but we already have that issue with sets. What's the union
> of {0} and {0.0}? Python's answer: It depends on the order of the
> operands.

That's a side-effect of how numeric equality works in Python. Since
0 == 0.0, you can't have both as keys in the same dict, or as elements
of the same set. Indeed, the same numeric equality issue occurs here:

py> from fractions import Fraction
py> [0, 2.5] == [0.0, Fraction(5, 2)]
True

So this has nothing really to do with sets or dicts specifically.
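
A short sketch of the operand-order effect mentioned above, as observed in CPython. Which of two equal elements survives a union is, as far as I know, not promised by the language itself, so treat the int/float outcome as CPython behaviour rather than a guarantee.

```python
from fractions import Fraction

# Equal numbers hash equally, so they collapse into a single set element.
assert 0 == 0.0 == Fraction(0)
assert hash(0) == hash(0.0) == hash(Fraction(0))

left_first = {0} | {0.0}
right_first = {0.0} | {0}

# Both unions compare equal and hold exactly one element...
assert left_first == right_first
assert len(left_first) == len(right_first) == 1

# ...but in CPython the surviving element comes from the left operand.
left_type = type(next(iter(left_first)))
right_type = type(next(iter(right_first)))
print(left_type.__name__, right_type.__name__)
```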

Aside: I think the contrary behaviour is, well, contrary. It would be 
strange and disturbing to do this:

for key in some_dict:
    if key == 0:
        print("found")
        print(some_dict[key])

and have the loop print "found" and then have the key lookup fail, but 
apparently that's how things work in Pike :-(
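
For contrast, a minimal sketch of the Python side: because equal numbers also hash equally, any key that compares equal to 0 can be looked up with 0. The dict contents here are made up for illustration.

```python
# Hypothetical dict whose only key is the float 0.0.
some_dict = {0.0: "zero"}

found = []
for key in some_dict:
    if key == 0:
        # The key compared equal to 0, so looking up 0 itself succeeds too.
        found.append(some_dict[0])

print(found)
```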

> I would say that Python can freely pick from the first two options you
> offered (either keep-first or keep-last), most likely the first one, and
> it'd make good sense. Your third option would be good for a few specific
> circumstances, but then you probably would also want the combination of
> {1:'a'} and {1:'a'} to be {1:['a','a']} for consistency.

Okay, that's six variations. And no, I don't think the "consistency" 
argument is right -- the idea is that you can have multiple values per 
key. Since 'a' == 'a', that's only one value, not two.

The variation using a list, versus the set, depends on whether you care 
about order or hashability.
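
A rough sketch of those two variations. Both helper names are hypothetical, not anything dict provides; the list version treats equal values as a single value, as argued above.

```python
def union_as_lists(d1, d2):
    """Hypothetical helper: collect the distinct values per key in a list."""
    result = {}
    for d in (d1, d2):
        for key, value in d.items():
            values = result.setdefault(key, [])
            if value not in values:  # 'a' == 'a' counts as one value
                values.append(value)
    return result


def union_as_sets(d1, d2):
    """Hypothetical helper: collect the values per key in a set."""
    result = {}
    for d in (d1, d2):
        for key, value in d.items():
            # Values must be hashable here, and their order is not kept.
            result.setdefault(key, set()).add(value)
    return result


print(union_as_lists({1: 'a'}, {1: 'b'}))  # {1: ['a', 'b']}
print(union_as_lists({1: 'a'}, {1: 'a'}))  # {1: ['a']}
print(union_as_sets({1: 'a'}, {1: 'b'}) == {1: {'a', 'b'}})  # True
```

The list version keeps first-seen order and accepts unhashable values such as lists; the set version raises TypeError for those.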

[...]
> Raising an error would work, but is IMO unnecessary.

I believe that's the only reasonable way for a dict union method to work. 
As the Zen says:

In the face of ambiguity, refuse the temptation to guess.

Since it is ambiguous which value should be associated with the key,
don't guess.
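
A minimal sketch of what such an error-raising union might look like. The name strict_union, the choice of ValueError, and the decision to let equal values pass without complaint are all my assumptions, not an existing dict method.

```python
def strict_union(d1, d2):
    """Hypothetical helper: merge two dicts, refusing to guess on conflicts."""
    result = dict(d1)
    for key, value in d2.items():
        if key in result and result[key] != value:
            # Same key, different values: ambiguous, so raise.
            raise ValueError(
                f"conflicting values for key {key!r}: "
                f"{result[key]!r} != {value!r}"
            )
        result.setdefault(key, value)
    return result


print(strict_union({1: 'a'}, {2: 'b'}))  # {1: 'a', 2: 'b'}
print(strict_union({1: 'a'}, {1: 'a'}))  # {1: 'a'}
```

With these assumptions, strict_union({1: 'a'}, {1: 'b'}) raises ValueError instead of picking a winner.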

[...]
> A Python set already has to distinguish between object value and object
> identity; a dict simply adds a bit more distinction between
> otherwise-considered-identical keys, namely their values.

Object identity is a red herring. It would be perfectly valid for a 
Python implementation to create new instances of each element in the set 
union, assuming such creation was free of side-effects (apart from memory 
usage and time, naturally). set.union() makes no promise about the 
identity of elements, and it is defined the same way for languages where 
object identity does not exist (say, old-school Pascal).
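
To illustrate, a small sketch in which the result of a union is checked using equality alone. Whether x and y happen to be the same object is an implementation detail, and nothing below depends on it.

```python
# Two equal tuples built at run time.
x = tuple([1, 2])
y = tuple([1, 2])

union = {x} | {y}

# The result is described purely by equality: one element, equal to both.
assert union == {x} == {y}
assert len(union) == 1
assert all(elem == x and elem == y for elem in union)
```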

-- 
Steven