[Python-Dev] PEP 383 and GUI libraries
Terry Reedy
tjreedy at udel.edu
Fri May 1 22:21:36 CEST 2009
Zooko O'Whielacronx wrote:
> Following-up to my own post to correct a major error:
> Is it true that
> srcbytes.encode(srcencoding, 'python-escape').decode('utf-8',
> 'python-escape') will always produce srcbytes ? That is my Requirement
If you start with bytes, decode with utf-8b to unicode (possibly
'invalid'), and encode the result back to bytes with utf-8b, you should
get the original bytes, regardless of what they were. That is the point
of PEP 383 -- to reliably roundtrip file 'names' that start as bytes and
must end as the same bytes but which may not otherwise have a unicode
decoding.
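A minimal sketch of that roundtrip follows. It assumes the error handler
under the name it eventually shipped with in Python 3.1, 'surrogateescape'
(this thread calls it utf-8b or 'python-escape'). The helper name
roundtrip and the sample file name are made up for illustration.

```python
def roundtrip(raw: bytes) -> bytes:
    """Decode bytes to str and encode back, escaping undecodable bytes."""
    # Assumption: 'surrogateescape' is the released name of the PEP 383 handler.
    text = raw.decode("utf-8", "surrogateescape")
    return text.encode("utf-8", "surrogateescape")


# A Latin-1 encoded name; the byte 0xE9 is not valid UTF-8 in this position.
name = b"caf\xe9.txt"
assert roundtrip(name) == name

# The property holds for every byte value, valid UTF-8 or not.
every_byte = bytes(range(256))
assert roundtrip(every_byte) == every_byte
```

Each undecodable byte is mapped to a lone surrogate in the range
U+DC80..U+DCFF on decoding and mapped back to the same byte on encoding,
which is why the original bytes come back unchanged.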
If you start with invalid unicode text, encode to bytes with utf-8b, and
decode back to unicode, you might instead get a different and valid
unicode text. An example was given in the discussion. I believe this
would be hard to avoid. In any case, it does not matter for the use
case of starting with bytes that one wants to temporarily but surely
work with as text.
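One such case, not necessarily the one given in the discussion, can be
sketched as below. It again assumes the released handler name
'surrogateescape'; the helper name reverse_roundtrip is made up for
illustration.

```python
def reverse_roundtrip(text: str) -> str:
    """Encode str to bytes and decode back, using the escaping handler."""
    # Assumption: 'surrogateescape' is the released name of the PEP 383 handler.
    raw = text.encode("utf-8", "surrogateescape")
    return raw.decode("utf-8", "surrogateescape")


# Two lone surrogates: not valid Unicode text on their own.
invalid_text = "\udcc3\udca9"

# They encode to the bytes C3 A9, which happen to be valid UTF-8 for U+00E9.
assert invalid_text.encode("utf-8", "surrogateescape") == b"\xc3\xa9"

result = reverse_roundtrip(invalid_text)
assert result == "\u00e9"       # a different, valid string
assert result != invalid_text   # so str -> bytes -> str is not lossless
```

The escaped bytes, once adjacent, form a well-formed UTF-8 sequence, so
the decoder has no reason to escape them again.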
Terry Jan Reedy