[Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces

Thomas Breuel tmbdev at gmail.com
Wed Apr 29 11:19:01 CEST 2009


> Sure. However, that requires you to provide meaningful, reproducible
> counter-examples, rather than a stenographic formulation that might
> hint some problem you apparently see (which I believe is just not
> there).


Well, here's another one: PEP 383 would disallow UTF-8 encodings of half
surrogates.  But such encodings are currently supported by Python, and they
are used as part of CESU-8 coding, which is, in fact, a common way of
converting UTF-16 to UTF-8.  How are you going to deal with existing code
that relies on being able to encode half surrogates as UTF-8?
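
To make the concern concrete, here is a minimal sketch of CESU-8-style
encoding, where each half of a UTF-16 surrogate pair is written as its own
three-byte UTF-8 sequence.  It assumes Python 3.1 or later; the helper name
to_cesu8 is hypothetical and only for illustration, and the "surrogatepass"
error handler is what later Python 3 releases provide for this purpose, not
something proposed in this thread.

```python
def to_cesu8(text):
    """Encode text CESU-8 style (illustrative helper, not a stdlib codec)."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp >= 0x10000:
            # Split the supplementary code point into a UTF-16 surrogate pair.
            cp -= 0x10000
            high = 0xD800 + (cp >> 10)
            low = 0xDC00 + (cp & 0x3FF)
            # Encode each half surrogate separately as a 3-byte sequence.
            out += (chr(high) + chr(low)).encode("utf-8", "surrogatepass")
        else:
            # BMP characters (including stray surrogates) pass through as-is.
            out += ch.encode("utf-8", "surrogatepass")
    return bytes(out)


# U+10400 is F0 90 90 80 in standard UTF-8, but two 3-byte sequences here.
print(to_cesu8("\U00010400"))  # b'\xed\xa0\x81\xed\xb0\x80'
```

Without the "surrogatepass" handler, a strict UTF-8 encode of a half
surrogate in Python 3 raises UnicodeEncodeError, which is exactly the kind
of behavior change that code relying on this would run into.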

Tom