[issue18713] Enable surrogateescape on stdin and stdout when appropriate
STINNER Victor
report at bugs.python.org
Thu Aug 22 15:18:23 CEST 2013
STINNER Victor added the comment:
> The surrogateescape error handler is dangerous with utf-16/32. It can produce globally invalid output.
I don't understand, can you give an example? surrogateescape can generate an invalid encoded string with any encoding. Example with UTF-8:
>>> b"a\xffb".decode("utf-8", "surrogateescape")
'a\udcffb'
>>> 'a\udcffb'.encode("utf-8", "surrogateescape")
b'a\xffb'
>>> b'a\xffb'.decode("utf-8")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 1: invalid start byte
So str.encode("utf-8", "surrogateescape") produces an invalid UTF-8 sequence.
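The same round trip can be written as a small script. This is only a sketch of the behaviour shown in the session above; the variable names (raw, text, round_tripped) are illustrative and not part of any API.

```python
# Bytes that are not valid UTF-8: 0xff can never start a UTF-8 sequence.
raw = b"a\xffb"

# Decoding with surrogateescape maps the undecodable byte 0xff
# to the lone surrogate U+DCFF instead of raising an error.
text = raw.decode("utf-8", "surrogateescape")

# Encoding with the same handler turns U+DCFF back into the byte 0xff,
# so the original (invalid) byte string is restored.
round_tripped = text.encode("utf-8", "surrogateescape")

# A strict decoder rejects the restored bytes.
try:
    round_tripped.decode("utf-8")
    strict_decode_failed = False
except UnicodeDecodeError:
    strict_decode_failed = True

# A strict encoder rejects the lone surrogate in the str as well.
try:
    text.encode("utf-8")
    strict_encode_failed = False
except UnicodeEncodeError:
    strict_encode_failed = True

print(repr(text), round_tripped, strict_decode_failed, strict_encode_failed)
```

The round trip is lossless for the original bytes, which is the purpose of the handler, but the encoded result is only as valid as the input was.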
----------
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue18713>
_______________________________________