[Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces
Glenn Linderman
v+python at g.nevcal.com
Tue Apr 28 20:48:37 CEST 2009
On approximately 4/28/2009 10:00 AM, came the following characters from
the keyboard of Martin v. Löwis:
> An alternative that doesn't suffer from the risk of not being able to
> store decoded strings would have been the use of PUA characters, but
> people rejected it because of the potential ambiguities. So they clearly
> dislike one risk more than the other. UTF-8b is primarily meant as
> an in-memory representation.
The UTF-8b representation suffers from the same potential ambiguities as
the PUA characters. They are perhaps slightly less likely to arise in
practice, because UTF-8b uses code points that are illegal in well-formed
Unicode (lone surrogates), but the theoretical likelihood is exactly the
same within the space of Python-acceptable character codes.
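
To make the ambiguity concrete, here is a minimal sketch. It assumes the
`surrogateescape` error handler available in Python 3.1 and later behaves
like the UTF-8b scheme discussed here; the variable names are purely
illustrative and nothing below comes from the PEP text itself.

```python
# Sketch only: 'surrogateescape' (Python 3.1+) is assumed to stand in
# for the UTF-8b scheme under discussion.

# An undecodable byte is carried in the str as a lone low surrogate.
raw = b"\x80"
escaped = raw.decode("utf-8", "surrogateescape")
assert escaped == "\udc80"
assert escaped.encode("utf-8", "surrogateescape") == raw  # round-trips

# The same code point can reach a str by another route. Here it is
# produced by decoding its three-byte form with 'surrogatepass'.
other_bytes = b"\xed\xb2\x80"
genuine = other_bytes.decode("utf-8", "surrogatepass")

# Python accepts both strings, and they are indistinguishable.
assert genuine == escaped

# Encoding the second string with 'surrogateescape' therefore yields
# the single byte 0x80, not the three bytes it actually came from.
reencoded = genuine.encode("utf-8", "surrogateescape")
assert reencoded == b"\x80"
assert reencoded != other_bytes
```

Once a lone surrogate is in a string, nothing records whether it stands
for an escaped byte or arrived as a character in its own right, which is
the same situation a PUA-based scheme would face.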
--
Glenn -- http://nevcal.com/
===========================
A protocol is complete when there is nothing left to remove.
-- Stuart Cheshire, Apple Computer, regarding Zero Configuration Networking