[issue24214] UTF-8 incremental decoder doesn't support surrogatepass correctly
Karthikeyan Singaravelan
report at bugs.python.org
Thu Jun 20 14:24:03 EDT 2019
Karthikeyan Singaravelan <tir.karthi at gmail.com> added the comment:
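This change seems to have caused the test failure reported in https://github.com/python-hyper/wsproto/issues/126. With final=False, the decoder now silently returns 'f' for input that previously raised UnicodeDecodeError: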
from codecs import getincrementaldecoder
decoder = getincrementaldecoder("utf-8")()
print(decoder.decode(b'f\xf1\xf6rd', False))
# With commit 7a465cb5ee
➜ cpython git:(7a465cb5ee) ./python.exe /tmp/foo.py
f
# Before commit 7a465cb5ee
➜ cpython git:(38f4e468d4) ./python.exe /tmp/foo.py
Traceback (most recent call last):
  File "/tmp/foo.py", line 3, in <module>
    print(decoder.decode(b'f\xf1\xf6rd', False))
  File "/Users/karthikeyansingaravelan/stuff/python/cpython/Lib/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf1 in position 1: invalid continuation byte
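
A minimal sketch of a check that separates the two cases, in case it helps with a regression test. The helper name probe is hypothetical and not part of codecs or wsproto. It assumes the pre-7a465cb5ee behaviour is the intended one: input that is already known to be invalid should raise even with final=False, while a merely truncated sequence should be buffered.

```python
from codecs import getincrementaldecoder


def probe(data, final=False):
    """Decode data with a fresh UTF-8 incremental decoder.

    Hypothetical helper for illustration only.
    Returns ("ok", text) on success or ("error", reason) on failure.
    """
    decoder = getincrementaldecoder("utf-8")()
    try:
        return ("ok", decoder.decode(data, final))
    except UnicodeDecodeError as exc:
        return ("error", exc.reason)


# 0xf1 starts a 4-byte sequence, but 0xf6 is not a continuation byte
# (those are 0x80..0xbf), so no later input can make this valid.
print(probe(b'f\xf1\xf6rd'))

# 0xc3 starts a 2-byte sequence that is only incomplete so far:
# with final=False it is buffered and just 'f' is returned.
print(probe(b'f\xc3'))

# The same truncated input with final=True is an error.
print(probe(b'f\xc3', final=True))
```

The first call is the one that differs between the two commits above. The other two calls show the buffering that final=False is meant to provide.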
----------
nosy: +xtreak
_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue24214>
_______________________________________