[issue12892] UTF-16 and UTF-32 codecs should reject (lone) surrogates
Ezio Melotti
report at bugs.python.org
Mon Jan 30 09:51:06 CET 2012
Ezio Melotti <ezio.melotti at gmail.com> added the comment:
Thanks for the patch!
> * fix an error in the error handler for utf-16-le. (In Python 3.2,
> b'\xdc\x80\x00\x41'.decode('utf-16-be', 'ignore') returns "\x00"
> instead of "A" for some reason)
This should probably be done in a separate patch that will be applied to 3.2/3.3 (assuming that it can go to 3.2). Rejecting surrogates will go in 3.3 only. (Note that a lot of Unicode-related code changed between 3.2 and 3.3.)
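For reference, the decoding case from the quoted report can be checked with a small sketch like the one below. The helper name `decode_be` is made up for this example and is not part of any patch. It assumes a current Python 3 interpreter in which the error handler fix has landed; on 3.2 the 'ignore' result reportedly differs, as quoted above.

```python
# Bytes from the report: a lone low surrogate (0xDC80) followed by "A" (0x0041),
# both stored as big-endian UTF-16 code units.
DATA = b'\xdc\x80\x00\x41'


def decode_be(data, errors='strict'):
    """Hypothetical helper: decode big-endian UTF-16 with the given error handler."""
    return data.decode('utf-16-be', errors)


# With 'ignore', the invalid code unit should be skipped and decoding continues.
ignored = decode_be(DATA, 'ignore')

# With 'strict', the lone surrogate is reported as a decoding error.
try:
    decode_be(DATA)
    strict_raised = False
except UnicodeDecodeError:
    strict_raised = True

print(repr(ignored), strict_raised)
```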
> Should we really reject lone surrogates for UTF-7?
No, I meant only UTF-8/16/32; UTF-7 is fine as is.
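A rough way to see which codecs reject a lone surrogate when encoding is sketched below. The helper `rejects_lone_surrogate` is a name invented for this example, and the results assume an interpreter in which the change discussed in this issue has been applied; older versions may accept the surrogate for UTF-16/32.

```python
# A lone low surrogate code point, not paired with a high surrogate.
LONE_SURROGATE = '\udc80'


def rejects_lone_surrogate(codec):
    """Hypothetical helper: True if encoding a lone surrogate raises an error."""
    try:
        LONE_SURROGATE.encode(codec)
    except UnicodeEncodeError:
        return True
    return False


# UTF-7 is included only for comparison; per the comment above it is left as is.
results = {
    codec: rejects_lone_surrogate(codec)
    for codec in ('utf-8', 'utf-16', 'utf-32', 'utf-7')
}
print(results)
```

If lone surrogates really do need to round-trip through UTF-8/16/32, the 'surrogatepass' error handler is the explicit opt-in for that.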
----------
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue12892>
_______________________________________