[issue30588] Missing documentation for codecs.escape_decode

Matthieu Dartiailh report at bugs.python.org
Wed Jun 7 11:36:21 EDT 2017


Matthieu Dartiailh added the comment:

The issue is that unicode_escape does not properly handle strings that mix
non-ASCII Unicode characters with escape sequences, because it assumes
Latin-1 compatible characters only. For example, given the string r'Δ\nΔ'
(with a literal backslash), one cannot encode it using latin-1, and encoding
it using utf-8 and then decoding with unicode_escape produces the wrong
output: 'Î\x94\nÎ\x94'. However, using
codecs.escape_decode(r'Δ\nΔ'.encode('utf-8'))[0].decode('utf-8') gives
the proper output. Internally the Python parser handles this case, but I
was unable to find where, and this is the closest solution I found. I
guess it may be possible using error handlers, but that seems much more
cumbersome.
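
For reference, here is a minimal sketch of the behaviour described above,
assuming Python 3. The helper names (via_unicode_escape, via_escape_decode,
via_backslashreplace) are hypothetical and only there for illustration. The
last helper is one guess at what the error-handler route could look like; it
has not been checked against every input.

```python
import codecs

# Four characters: Δ, a literal backslash, the letter n, Δ.
RAW = r'Δ\nΔ'


def via_unicode_escape(text):
    """UTF-8 encode, then decode with unicode_escape.

    unicode_escape treats non-escape bytes as Latin-1, so each UTF-8
    byte of a non-ASCII character becomes its own character.
    """
    return text.encode('utf-8').decode('unicode_escape')


def via_escape_decode(text):
    """Process escapes at the bytes level, then decode the bytes as UTF-8.

    codecs.escape_decode is undocumented; it returns a tuple of
    (decoded bytes, number of input bytes consumed).
    """
    decoded_bytes, _consumed = codecs.escape_decode(text.encode('utf-8'))
    return decoded_bytes.decode('utf-8')


def via_backslashreplace(text):
    """Assumed alternative using only documented codecs and error handlers.

    Characters outside Latin-1 are rewritten as \\uXXXX escapes while
    encoding, and unicode_escape then decodes them together with the
    escapes already present in the text.
    """
    return text.encode('latin-1', 'backslashreplace').decode('unicode_escape')


if __name__ == '__main__':
    print(repr(via_unicode_escape(RAW)))    # the mangled result from the report
    print(repr(via_escape_decode(RAW)))     # 'Δ\nΔ'
    print(repr(via_backslashreplace(RAW)))  # 'Δ\nΔ'
```

The escape_decode version keeps the UTF-8 bytes intact while the escapes are
processed, which is why the final utf-8 decode succeeds.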

Best regards

Matthieu

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue30588>
_______________________________________

