[Python-Dev] unicode hell/mixing str and unicode as dictionary keys
Ralf Schmitt
ralf at brainbot.com
Thu Aug 3 17:47:37 CEST 2006
Ralf Schmitt wrote:
> Still trying to port our software. Here's another thing I noticed:
>
> d = {}
> d[u'm\xe1s'] = 1
> d['m\xe1s'] = 1
> print d
>
> With Python 2.4 I can add those two keys to the dictionary and get:
> $ python2.4 t2.py
> {u'm\xe1s': 1, 'm\xe1s': 1}
>
> With Python 2.5 I get:
>
> $ python2.5 t2.py
> Traceback (most recent call last):
>   File "t2.py", line 3, in <module>
>     d['m\xe1s'] = 1
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xe1 in position 1: ordinal not in range(128)
>
> Is this intended behaviour? I guess this might break lots of programs,
> and the way Python 2.4 works looks right to me.
> I think it should be possible to mix str/unicode keys in dicts and let
> non-ASCII strings compare not-equal to any unicode string.
This behaviour also makes programs break seemingly at random: the error
only shows up when the string you add hashes to the same value as a
unicode string already in the dict, because only then are the two keys
compared (at least that's my guess).
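
A minimal sketch of that guess, not taken from the original report: the
hypothetical Key class below stands in for a str/unicode pair whose
comparison raises, and it is written to run on current Python rather than
2.5. It suggests that a dict only compares two keys once their hashes match,
so a failing comparison surfaces only for colliding keys.

```python
class Key(object):
    """Hypothetical key type whose equality check always fails.

    It stands in for a str/unicode comparison that raises
    UnicodeDecodeError; the hash is chosen by the caller.
    """

    def __init__(self, hash_value):
        self.hash_value = hash_value

    def __hash__(self):
        return self.hash_value

    def __eq__(self, other):
        # Reached only when the dict finds a different stored key
        # with the same hash value.
        raise ValueError("comparison failed")


def try_insert(d, key):
    """Insert key into d and report whether the comparison blew up."""
    try:
        d[key] = 1
        return "ok"
    except ValueError:
        return "comparison raised"


d = {}
results = [
    try_insert(d, Key(1)),  # empty dict: nothing to compare against
    try_insert(d, Key(2)),  # different hash: keys are never compared
    try_insert(d, Key(1)),  # same hash as a stored key: compared, raises
]
print(results)
```

Under this assumption, whether t2.py fails depends on the hash values of the
keys involved, not just on mixing str and unicode keys as such.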
- Ralf