[issue24214] UTF-8 incremental decoder doesn't support surrogatepass correctly

Serhiy Storchaka report at bugs.python.org
Sat Jun 22 06:11:05 EDT 2019


Serhiy Storchaka <storchaka+cpython at gmail.com> added the comment:

Victor, I think you misunderstood the issue. The problem is not that a decoding error is raised. The problem is that the incremental decoder no longer raises where it raised before.

I think that both behaviors may be correct, and that it is better not to rely on the ability of the incremental decoder with final=False to detect invalid encoded data, but I see now that it is possible to fix the original issue more carefully, without changing that behavior. PR 14304 does this.
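For illustration, here is a minimal sketch of the final=True behavior that can be relied on, using only the stdlib codecs module. The helper names (decode_in_chunks, strict_rejects) and the sample code point U+D90C are assumptions made for this example and are not taken from the PR. The sketch deliberately avoids asserting at which final=False call the strict decoder raises, since that is the part that should not be relied on.

```python
import codecs

def decode_in_chunks(data, encoding="utf-8", errors="strict"):
    """Feed data one byte at a time, then flush with final=True."""
    decoder = codecs.getincrementaldecoder(encoding)(errors)
    pieces = [decoder.decode(data[i:i + 1], final=False)
              for i in range(len(data))]
    # The final call forces the decoder to deal with anything still buffered.
    pieces.append(decoder.decode(b"", final=True))
    return "".join(pieces)

# A lone surrogate can only be encoded to UTF-8 with surrogatepass.
ENCODED_SURROGATE = "\ud90c".encode("utf-8", "surrogatepass")

# With surrogatepass, the chunked decode should round-trip the surrogate.
result = decode_in_chunks(ENCODED_SURROGATE, errors="surrogatepass")

def strict_rejects(data):
    """Return True if the strict decoder raises at some point before or at the final flush."""
    try:
        decode_in_chunks(data, errors="strict")
    except UnicodeDecodeError:
        return True
    return False
```

Whether the strict decoder raises on an intermediate chunk or only on the final flush may differ between versions; the helper treats both as a rejection.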

It also changes the UTF-16 incremental decoder with the surrogatepass error handler to return non-empty data when decoding a low surrogate with final=False. It is not necessary, but it makes all UTF-* decoders consistent and makes tests simpler.
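A rough sketch of the UTF-16 case, assuming the little-endian variant and the lone low surrogate U+DC00 as sample input (both chosen for this example only). Whether the first call already returns the surrogate or leaves it for the final flush depends on the Python version, so only the combined output is compared.

```python
import codecs

# A lone low surrogate U+DC00 in UTF-16-LE (sample input for this sketch).
LOW_SURROGATE_LE = b"\x00\xdc"

def utf16_decode_steps(data):
    """Decode with final=False, then flush; return both partial results."""
    decoder = codecs.getincrementaldecoder("utf-16-le")("surrogatepass")
    first = decoder.decode(data, final=False)
    rest = decoder.decode(b"", final=True)
    return first, rest

first, rest = utf16_decode_steps(LOW_SURROGATE_LE)

# Only the concatenation is version-independent in this sketch.
combined = first + rest
```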

----------
resolution: fixed -> 
stage: resolved -> patch review
status: closed -> open
versions: +Python 3.9

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue24214>
_______________________________________

