Looking for UNICODE to ASCII Conversion Example Code

Chris Angelico rosuav at gmail.com
Sat Oct 19 18:10:02 EDT 2013


On Sun, Oct 20, 2013 at 3:49 AM, Roy Smith <roy at panix.com> wrote:
> So, yesterday, I tracked down an uncaught exception stack in our logs to a user whose username included the unicode character 'SMILING FACE WITH SUNGLASSES' (U+1F60E).  It turns out, that's perfectly fine as a user name, except that in one obscure error code path, we try to str() it during some error processing.

How is that a problem? Surely you have to deal with non-ASCII
characters all the time - how is that particular one a problem? I'm
looking at its UTF-8 and UTF-16 representations and not seeing
anything strange, unless it's the \x0e in UTF-16 - but, again, you
must surely have had to deal with
non-ASCII-encoded-whichever-way-you-do-it.
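
For reference, here is a quick sketch of how to look at those representations yourself. It assumes Python 3, and the variable names are only illustrative:

```python
import unicodedata

ch = "\U0001F60E"  # the character from the report

# Official Unicode name of the code point
name = unicodedata.name(ch)

# Encode without a BOM so only the character's own bytes show up
utf8 = ch.encode("utf-8")
utf16 = ch.encode("utf-16-be")

print(name)
print("UTF-8: ", utf8.hex())
print("UTF-16:", utf16.hex())

# The low surrogate is U+DE0E, so its second byte is 0x0E
print("contains 0x0e:", b"\x0e" in utf16)
```

The UTF-16 form is a surrogate pair, since the character lies outside the Basic Multilingual Plane; that is where the 0x0E byte comes from.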

Or are you saying that that particular error code path did NOT handle
non-ASCII characters? If so, that's a strong argument for moving to
Python 3, to get full Unicode support in _all_ branches.
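
To illustrate what I suspect is going on (a sketch only; I haven't seen the actual code, and the username below is made up): in Python 2, calling str() on a unicode object implicitly encodes it with the default codec, which is normally ASCII, so any non-ASCII character raises UnicodeEncodeError. Python 3's str() has no such implicit step. The following runs on Python 3 and spells out the Python 2 behaviour as an explicit encode:

```python
username = "roy\U0001F60E"  # hypothetical username, not from the real logs

# Python 3: str() on a text string returns it unchanged.
assert str(username) == username

# Python 2's str(u"...") behaves roughly like this explicit ASCII encode.
try:
    username.encode("ascii")
    failed = False
except UnicodeEncodeError:
    failed = True

print("ASCII encode failed:", failed)

# If an ASCII-only form really is needed (say, for a log line),
# choose the error handler explicitly instead of relying on a default.
safe = username.encode("ascii", "backslashreplace").decode("ascii")
print(safe)
```

Whether backslashreplace is the right handler depends on where the text ends up; "replace" or "ignore" are the other standard options, and both lose information.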

ChrisA


