[Python-Dev] Dicts are broken Was: unicode hell/mixing str and unicode as dictionary keys

Paul Colomiets pc at gafol.net
Fri Aug 4 20:31:38 CEST 2006


Hi!

Terry Reedy wrote:
> The fundamental axiom of sets and hence of dict keys is that any 
> object/value either is or is not a member (at any given time for 'mutable' 
> set collections).  This requires that testing an object for possible 
> membership by equality give a clean True or False answer.
>   
Yes, this makes sense. But coming back to dictionaries: to Python newbies
it will seem strange that this
 >>> d = { u'abc': 1, u'ab\xe8': 2}
 >>> d['abc']
 1
works as expected, but this
 >>> d['ab\xe8']
raises an exception.

Another good point was made by M.-A. Lemburg:
> What's making this particular case interesting is that
> the comparison is hidden in the dictionary implementation
> and only triggers if you get a hash collision, which makes
> the whole issue appear to be happening randomly.
>
> This whole thread aside: it's never recommended to mix strings
> and Unicode, unless you really have to.
...
> How about generating a warning instead and then go for the exception
> in 2.6 ?
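
The "hidden comparison" point quoted above can be made visible with a small
sketch. It is only an illustration, written for a current CPython 3 rather
than the 2.5 discussed in this thread, and the Key class with its eq_calls
counter is a hypothetical helper, not anything from the dict implementation:

```python
class Key:
    """Hypothetical key type that counts every equality test."""
    eq_calls = 0

    def __init__(self, name, hash_value):
        self.name = name
        self.hash_value = hash_value

    def __hash__(self):
        # The hash is chosen by the caller so collisions can be forced.
        return self.hash_value

    def __eq__(self, other):
        Key.eq_calls += 1
        return isinstance(other, Key) and self.name == other.name


d = {Key("stored", 1): "value"}

# Different hash: the lookup misses without calling __eq__ at all.
Key.eq_calls = 0
found_without_collision = Key("other", 2) in d
calls_without_collision = Key.eq_calls

# Same hash, different object: the dict has to call __eq__ to decide.
Key.eq_calls = 0
found_with_collision = Key("other", 1) in d
calls_with_collision = Key.eq_calls
```

Both lookups miss, but only the second one runs the comparison, so an
__eq__ that raises would only blow up for keys whose hashes happen to match.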

Well, it's not recommended to mix strings and unicode in dictionaries,
but if we mix, for example, integers and floats we get the same thing. It
doesn't raise an exception, but it is still not the behavior I would expect:
 >>> d = { 1.0: 10, 2.0: 20 }
then if I later do:
 >>> d[1] = 100
 >>> d[2] = 200
I end up with only floats in d.keys(). Maybe this is not the best example.
So if you generate a warning, it should be generated every time keys of
different types are inserted into a dict. Maybe Python should check the
type of the key after a collision and before testing for equality? Then 1
and 1.0 would be different, just as u'a' and 'a' would be different.
It could even add some performance overhead, I think.
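
To make the int/float case and the type-check idea concrete, here is a
minimal sketch. It assumes a current CPython 3, and TypedKey is a
hypothetical wrapper invented for illustration; it emulates "compare the
type before testing for equality" at the Python level and is not how dict
works or a proposed patch:

```python
# Plain dict: 1 == 1.0 and hash(1) == hash(1.0), so the int lookups
# land on the existing float keys and only the values are replaced.
d = {1.0: 10, 2.0: 20}
d[1] = 100
d[2] = 200
keys_after = list(d.keys())
values_after = list(d.values())


class TypedKey:
    """Hypothetical wrapper: equal only if both type and value match."""

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        # Mixing the type into the hash keeps 1 and 1.0 apart early on.
        return hash((type(self.value), self.value))

    def __eq__(self, other):
        if not isinstance(other, TypedKey):
            return NotImplemented
        # Check the type first, then fall back to ordinary equality.
        return (type(self.value) is type(other.value)
                and self.value == other.value)


typed = {TypedKey(1.0): 10}
typed[TypedKey(1)] = 100   # a second entry, not an overwrite
```

With the wrapper, 1 and 1.0 become two separate keys, at the price of an
extra type comparison on every equality test.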

--
Regards,
  Paul.

