[Python-Dev] unicode hell/mixing str and unicode as dictionary keys

Ralf Schmitt ralf at brainbot.com
Thu Aug 3 17:47:37 CEST 2006


Ralf Schmitt wrote:
> Still trying to port our software. Here's another thing I noticed:
> 
> d = {}
> d[u'm\xe1s'] = 1
> d['m\xe1s'] = 1
> print d
> 
> With python 2.4 I can add those two keys to the dictionary and get:
> $ python2.4 t2.py
> {u'm\xe1s': 1, 'm\xe1s': 1}
> 
> With python 2.5 I get:
> 
> $ python2.5 t2.py
> Traceback (most recent call last):
>    File "t2.py", line 3, in <module>
>      d['m\xe1s'] = 1
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xe1 in position 1: 
> ordinal not in range(128)
> 
> Is this intended behaviour? I guess this might break lots of programs 
> and the way python 2.4 works looks right to me.
> I think it should be possible to mix str/unicode keys in dicts and let 
> non-ascii strings compare not-equal to any unicode string.

Also, this behaviour makes programs break seemingly at random: the
exception is raised only when the str key you add hashes to the same
value as a unicode key already in the dict, since only then are the two
keys compared (at least that's my guess).
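
A minimal sketch of that guess, written so it runs on a current Python
interpreter rather than on 2.5 specifically. RaisingKey and try_insert are
hypothetical names made up for this illustration, and the ValueError only
stands in for the UnicodeDecodeError from the traceback. It assumes the
dict calls __eq__ on two distinct keys only when their hashes are equal:

```python
class RaisingKey(object):
    """Hypothetical key whose equality test always fails, standing in
    for a str/unicode comparison that raises UnicodeDecodeError."""

    def __init__(self, name, hash_value):
        self.name = name
        self.hash_value = hash_value

    def __hash__(self):
        # The hash is chosen by the caller so collisions can be forced.
        return self.hash_value

    def __eq__(self, other):
        if other is self:
            return True
        raise ValueError("cannot compare %r with another key" % self.name)


def try_insert(d, key):
    """Return 'ok' if the insert succeeds, else the exception class name."""
    try:
        d[key] = 1
        return "ok"
    except ValueError as exc:
        return type(exc).__name__


d = {}
d[RaisingKey("first", 1)] = 1

# Different hash: the keys are never compared, so the insert succeeds.
no_collision = try_insert(d, RaisingKey("second", 2))

# Same hash as "first": the dict compares the keys and the error surfaces.
collision = try_insert(d, RaisingKey("third", 1))

print(no_collision)
print(collision)
```

Whether an insert fails here depends only on the hash values, not on the
order of the statements, which is why the failure looks random from the
outside.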

- Ralf



