[issue19539] The 'raw_unicode_escape' codec buggy + not apropriate for Python 3.x

Serhiy Storchaka report at bugs.python.org
Sun Nov 10 08:05:29 CET 2013


Serhiy Storchaka added the comment:

The 'raw_unicode_escape' codec can be neither removed nor changed because it is used in the pickle protocol. Just don't use it if its behavior looks weird to you.
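
As a rough illustration of that dependency, the sketch below pickles a string with the text-based protocol 0 and checks that the escaped form shows up in the output. The variable names are only for illustration, and the exact byte layout of the pickle is not spelled out here beyond what the checks assert.

```python
import pickle

text = "zażółć"

# Protocol 0 is the original text-based pickle format; it stores str
# objects using the raw-unicode-escape encoding.
data = pickle.dumps(text, protocol=0)

# Code points above U+00FF appear as \uXXXX escapes in the pickle ...
has_escape = b'\\u017c' in data
# ... while code points below U+0100 (here 'ó') appear as single Latin-1 bytes.
has_latin1_byte = b'\xf3' in data

# Loading the pickle restores the original string.
restored = pickle.loads(data)
print(has_escape, has_latin1_byte, restored == text)
```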

The right way to decode raw_unicode_escape-encoded data is to use the 'raw_unicode_escape' decoder.
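
A minimal round-trip sketch of that, using the same sample string as the eval() example in this message (the variable names are illustrative only):

```python
text = "zażółć"

# Encoding: code points below U+0100 become single Latin-1 bytes,
# everything else becomes a \uXXXX escape.
encoded = text.encode('raw_unicode_escape')

# Decoding with the matching codec reverses both kinds of output.
decoded = encoded.decode('raw_unicode_escape')

print(encoded)
print(decoded == text)
```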

If a string doesn't contain quotes, you can use eval(), but you should first decode the data from latin1 and then encode it to UTF-8:

>>> literal = ('r"%s"' % "zażółć".encode('raw_unicode_escape').decode('latin1')).encode()
>>> literal
b'r"za\\u017c\xc3\xb3\\u0142\\u0107"'
>>> eval(literal)
'za\\u017có\\u0142\\u0107'

----------
nosy: +serhiy.storchaka

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue19539>
_______________________________________

