[issue19539] The 'raw_unicode_escape' codec buggy + not apropriate for Python 3.x
Serhiy Storchaka
report at bugs.python.org
Sun Nov 10 08:05:29 CET 2013
Serhiy Storchaka added the comment:
The 'raw_unicode_escape' codec can be neither removed nor changed, because it is used in the pickle protocol. Just don't use it if its behavior looks weird to you.
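As a rough illustration of that dependency, the sketch below pickles a string with the text-based protocol 0 and checks that non-Latin-1 characters show up in the payload as \uXXXX escapes. The variable names are only illustrative, and the exact byte layout of the payload is a CPython implementation detail, so only the escape and the round trip are checked:

```python
import pickle

text = "zażółć"

# Protocol 0 is the text-based pickle format; str payloads are written
# with raw-unicode-escape style escaping.
payload = pickle.dumps(text, protocol=0)

# Characters outside Latin-1 (such as U+017C) appear as \uXXXX escapes.
has_escape = b"\\u017c" in payload

# Unpickling restores the original string.
restored = pickle.loads(payload)
```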
The right way to decode raw_unicode_escape-encoded data is to use the 'raw_unicode_escape' decoder.
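A minimal round-trip sketch of that approach, using only the built-in codec (the variable names are illustrative):

```python
text = "zażółć"

# Encoding: characters below U+0100 become single Latin-1 bytes,
# everything else becomes a \uXXXX escape.
encoded = text.encode("raw_unicode_escape")

# Decoding with the matching codec reverses the encoding exactly.
decoded = encoded.decode("raw_unicode_escape")
```

Here 'ó' (U+00F3) is stored as the single byte 0xF3, while 'ż', 'ł' and 'ć' are stored as escapes, so a plain UTF-8 or ASCII decode of the same bytes would not recover the original text.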
If a string doesn't contain quotes, you can use eval(), but you should first decode the data from latin1 and encode it to UTF-8:
>>> literal = ('r"%s"' % "zażółć".encode('raw_unicode_escape').decode('latin1')).encode()
>>> literal
b'r"za\\u017c\xc3\xb3\\u0142\\u0107"'
>>> eval(literal)
'za\\u017có\\u0142\\u0107'
----------
nosy: +serhiy.storchaka
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue19539>
_______________________________________