[Python-Dev] Dicts are broken Was: unicode hell/mixing str and unicode as dictionary keys
Paul Colomiets
pc at gafol.net
Fri Aug 4 20:31:38 CEST 2006
Hi!
Terry Reedy wrote:
> The fundamental axiom of sets and hence of dict keys is that any
> object/value either is or is not a member (at any given time for 'mutable'
> set collections). This requires that testing an object for possible
> membership by equality give a clean True or False answer.
>
Yes, this makes sense. But coming back to dictionaries: Python newbies
will find it strange that this
>>> d = { u'abc': 1, u'ab\xe8': 2}
>>> d['abc']
1
works as expected, but this
>>> d['ab\xe8']
raises an exception.
Another good argument was made by M.-A. Lemburg:
> What's making this particular case interesting is that
> the comparison is hidden in the dictionary implementation
> and only triggers if you get a hash collision, which makes
> the whole issue appear to be happening randomly.
>
> This whole thread aside: it's never recommended to mix strings
> and Unicode, unless you really have to.
...
>How about generating a warning instead and then go for the exception
>in 2.6 ?
Well, it's not recommended to mix strings and unicode in dictionaries,
but if we mix, for example, integers and floats, we get the same thing. It
doesn't raise an exception, but it is still not the behavior I expect:
>>> d = { 1.0: 10, 2.0: 20 }
then if I later do, somewhere else:
>>> d[1] = 100
>>> d[2] = 200
I would not expect d.keys() to still contain only floats. Maybe this is
not the best example.
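
To make the int/float case concrete, here is a minimal sketch, assuming a
current Python 3 interpreter (the names keys and key_types are only
illustrative). Since 1 == 1.0 and hash(1) == hash(1.0), the later
assignments replace the values but leave the original float keys in place:

```python
# Keys that compare equal and hash equal count as the same dict key,
# whatever their types are.
d = {1.0: 10, 2.0: 20}
d[1] = 100   # updates the entry stored under 1.0
d[2] = 200   # updates the entry stored under 2.0

keys = list(d.keys())
key_types = [type(k).__name__ for k in keys]

print(keys)       # [1.0, 2.0]
print(key_types)  # ['float', 'float']
print(d[1.0])     # 100
```

No new entries are created, so len(d) stays at 2.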
So if you generate a warning, it should be generated every time keys of
different types are inserted into a dict. Maybe Python should check the
type of the key after a collision and before testing for equality? Then
1 and 1.0 would be different keys, just as u'a' and 'a' would be.
It may also add some performance overhead, I think.
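
The type check could be emulated at the Python level roughly as follows.
This is only a sketch under my own assumptions, not a patch to the dict
implementation: TypeStrictDict is a hypothetical class, it targets a
current Python 3 interpreter, and it stores (type(key), key) pairs
instead of comparing types after a collision.

```python
class TypeStrictDict:
    """Hypothetical mapping that treats keys of different types as distinct."""

    def __init__(self):
        self._data = {}

    def __setitem__(self, key, value):
        # Pairing the key with its type keeps 1 and 1.0 apart,
        # because (int, 1) != (float, 1.0).
        self._data[(type(key), key)] = value

    def __getitem__(self, key):
        return self._data[(type(key), key)]

    def __contains__(self, key):
        return (type(key), key) in self._data

    def __len__(self):
        return len(self._data)

    def keys(self):
        # Return the original keys, without the type tag.
        return [key for (_, key) in self._data]


d = TypeStrictDict()
d[1.0] = 10
d[1] = 100          # a separate entry, not an update of 1.0

print(len(d))       # 2
print(d[1.0], d[1]) # 10 100
print(2 in d)       # False
```

Every operation builds an extra tuple, which is one form of the overhead
mentioned above.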
--
Regards,
Paul.