[issue24214] UTF-8 incremental decoder doesn't support surrogatepass correctly
STINNER Victor
report at bugs.python.org
Fri Jun 21 17:06:40 EDT 2019
STINNER Victor <vstinner at redhat.com> added the comment:
> UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf1 in position 1: invalid continuation byte
Python is right: b'f\xf1\xf6rd' is not a valid UTF-8 string:
$ python3
Python 3.7.3 (default, May 11 2019, 00:38:04)
>>> b'f\xf1\xf6rd'.decode()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf1 in position 1: invalid continuation byte
This change is deliberate: it makes the UTF-8 incremental decoder correct (it now respects the UTF-8 standard). I am closing the issue.
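
For illustration only, here is a minimal sketch of the same check done through the incremental decoder API. The helper names (decode_incrementally, is_rejected) are made up for this example and are not part of the codecs module. The sketch assumes that 0xf1 opens a 4-byte sequence and that 0xf6 is not a continuation byte, so the input should be rejected whether it is fed whole or byte by byte, and "surrogatepass" should not change that because these bytes do not encode a surrogate.

```python
import codecs

# 0xf1 is a lead byte of a 4-byte sequence; 0xf6 is not in the
# continuation range 0x80-0xbf, so this is not well-formed UTF-8.
data = b'f\xf1\xf6rd'


def decode_incrementally(chunks, errors='strict'):
    """Feed chunks to a fresh UTF-8 incremental decoder (hypothetical helper)."""
    decoder = codecs.getincrementaldecoder('utf-8')(errors)
    parts = [decoder.decode(chunk) for chunk in chunks]
    # Flush the decoder so a truncated trailing sequence is reported too.
    parts.append(decoder.decode(b'', final=True))
    return ''.join(parts)


def is_rejected(chunks, errors='strict'):
    """Return True if decoding the chunks raises UnicodeDecodeError."""
    try:
        decode_incrementally(chunks, errors)
    except UnicodeDecodeError:
        return True
    return False


# Fed as one chunk, or one byte at a time, the input is rejected.
one_shot_rejected = is_rejected([data])
bytewise_rejected = is_rejected([bytes([b]) for b in data])

# "surrogatepass" only lets encoded surrogates through; it does not
# make arbitrary invalid bytes acceptable.
surrogatepass_rejected = is_rejected([data], 'surrogatepass')

# For comparison, a genuine encoded lone surrogate (U+D900) is accepted
# by the non-incremental decoder with "surrogatepass".
lone_surrogate = b'\xed\xa4\x80'.decode('utf-8', 'surrogatepass')

print(one_shot_rejected, bytewise_rejected, surrogatepass_rejected)
print(ascii(lone_surrogate))
```

The last line uses the plain bytes.decode() call on purpose: how the incremental decoder handles a surrogate split across chunks is what this issue changed, so that part may differ between Python versions.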
----------
resolution: -> fixed
status: open -> closed
_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue24214>
_______________________________________