[issue24214] UTF-8 incremental decoder doesn't support surrogatepass correctly

STINNER Victor report at bugs.python.org
Fri Jun 21 17:06:40 EDT 2019


STINNER Victor <vstinner at redhat.com> added the comment:

> UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf1 in position 1: invalid continuation byte

Python is right: b'f\xf1\xf6rd' is not a valid UTF-8 string:

$ python3
Python 3.7.3 (default, May 11 2019, 00:38:04) 
>>> b'f\xf1\xf6rd'.decode()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf1 in position 1: invalid continuation byte

This change is deliberate: it makes the UTF-8 incremental decoder correct (it now respects the UTF-8 standard). I am closing the issue.
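
For reference, here is a minimal sketch of the same check done through the incremental decoder API rather than bytes.decode(). The helper name try_incremental_utf8 is hypothetical and is not part of the codecs module. The Latin-1 decoding at the end is only an assumption about what the bytes may have been meant to represent; the report does not say.

```python
import codecs

# Bytes from the report above: 0xf1 starts a multi-byte UTF-8 sequence,
# but 0xf6 is not a valid continuation byte.
data = b'f\xf1\xf6rd'


def try_incremental_utf8(chunk):
    """Decode chunk with a fresh UTF-8 incremental decoder.

    Hypothetical helper: returns the decoded text, or the
    UnicodeDecodeError instance if the input is not valid UTF-8.
    """
    decoder = codecs.getincrementaldecoder('utf-8')()
    try:
        # final=True tells the decoder that no more input will follow.
        return decoder.decode(chunk, final=True)
    except UnicodeDecodeError as exc:
        return exc


result = try_incremental_utf8(data)
print(type(result).__name__, result.start, result.reason)

# Assumption: the bytes were produced with a single-byte encoding such as
# Latin-1, where every byte value maps to exactly one character.
latin1_text = data.decode('latin-1')
print(ascii(latin1_text))
```

If the data really is in a legacy single-byte encoding, the decoder has to be created for that encoding explicitly; the UTF-8 decoder is expected to reject it.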

----------
resolution:  -> fixed
status: open -> closed

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue24214>
_______________________________________

