[Python-Dev] unicode hell/mixing str and unicode as dictionary keys
Ralf Schmitt
ralf at brainbot.com
Thu Aug 3 17:40:57 CEST 2006
Still trying to port our software. Here's another thing I noticed:
d = {}
d[u'm\xe1s'] = 1
d['m\xe1s'] = 1
print d
With Python 2.4 I can add both keys to the dictionary and get:
$ python2.4 t2.py
{u'm\xe1s': 1, 'm\xe1s': 1}
With Python 2.5 I get:
$ python2.5 t2.py
Traceback (most recent call last):
  File "t2.py", line 3, in <module>
    d['m\xe1s'] = 1
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe1 in position 1: ordinal not in range(128)
Is this intended behaviour? I guess this might break lots of programs,
and the way Python 2.4 works looks right to me.
I think it should be possible to mix str and unicode keys in dicts and let
non-ASCII byte strings compare not-equal to any unicode string.
At least it should be documented prominently in the "What's New" document.
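To make the suggested comparison rule concrete, here is a minimal sketch. The helper name keys_equal is hypothetical and not part of Python; it only illustrates the idea that a byte string which cannot be decoded as ASCII is treated as unequal to any unicode string, instead of raising. The byte strings use a b prefix so the sketch also runs on newer interpreters; on 2.4/2.5 a plain str literal plays that role.

```python
def keys_equal(byte_key, text_key):
    """Hypothetical helper: compare a byte string with a unicode string.

    The byte string is decoded with the ASCII codec (the default codec
    named in the traceback above). If decoding fails, the two keys are
    reported as not equal rather than letting the error propagate.
    """
    try:
        return byte_key.decode('ascii') == text_key
    except UnicodeDecodeError:
        # Non-ASCII bytes: treat as unequal to every unicode string.
        return False


# Pure-ASCII keys still compare equal after decoding.
ascii_match = keys_equal(b'mas', u'mas')

# The keys from the example above: 0xe1 is not ASCII, so no match and no error.
non_ascii_match = keys_equal(b'm\xe1s', u'm\xe1s')
```

With this rule, ascii_match is True and non_ascii_match is False, so both keys from the example could sit side by side in one dict, as they do under 2.4.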
- Ralf