The reverse of encode('...', 'backslashreplace')

Duncan Booth duncan.booth at invalid.invalid
Tue Sep 4 03:27:54 EDT 2007


"Tor Erik Sønvisen" <torerik81 at gmail.com> wrote:

> How can I transform b so that the assertion holds? I.e., how can I
> reverse the backslash-replaced encoding, while retaining the str-type?
> 
>>>> a = u'æ'
>>>> b = a.encode('ascii', 'backslashreplace')
>>>> b
> '\\xe6'
>>>> assert isinstance(b, str) and b == 'æ'
> 
> Traceback (most recent call last):
>   File "<pyshell#59>", line 1, in <module>
>     assert isinstance(b, str) and b == 'æ'
> AssertionError
> 

The simple answer is that you cannot: backslashreplace is not a 
reversible operation. For example, try:

>>> a = u'\\xe6æ'
>>> print a
\xe6æ
>>> b = a.encode('ascii', 'backslashreplace')
>>> b
'\\xe6\\xe6'
>>> 
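The same collision can be checked outside the interactive prompt. The 
following is only a sketch and assumes Python 3, where encode() returns 
bytes rather than str; the variable names are made up for illustration:

```python
# Python 3 sketch (an assumption: the session above is Python 2).
literal = '\\xe6'   # four characters: backslash, x, e, 6
real = '\xe6'       # one character: LATIN SMALL LETTER AE

# The literal text is already ASCII, so it passes through unchanged.
encoded_literal = literal.encode('ascii', 'backslashreplace')
# The non-ASCII character is replaced by a backslash escape.
encoded_real = real.encode('ascii', 'backslashreplace')

# Two different inputs give identical output, so no decoder
# can know which input it was given.
collide = (literal != real) and (encoded_literal == encoded_real)
print(collide)
```

Because both inputs map to the same byte string, any function that 
tries to undo the encoding has to guess.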

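After the encoding there is no way to tell which of the \xe6 sequences 
stands for a replaced character and needs reversing, and which was 
literal text all along. Perhaps the unicode_escape codec is what you 
want; it also escapes literal backslashes, so the result can be decoded 
unambiguously: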

>>> b = a.encode('unicode_escape')
>>> print b
\\xe6\xe6
>>> print b.decode('unicode_escape')
\xe6æ
>>> 
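For anyone trying this on Python 3, here is a rough equivalent of the 
round trip. It is a sketch under the assumption that bytes are an 
acceptable stand-in for the Python 2 str type; the name "restored" is 
only illustrative:

```python
# Python 3 sketch (an assumption: the session above is Python 2).
a = '\\xe6\xe6'   # literal backslash-x-e-6 followed by the real character

# unicode_escape doubles the literal backslash and escapes the
# non-ASCII character, so the two cases stay distinguishable.
b = a.encode('unicode_escape')

# The encoded form contains only ASCII bytes.
is_ascii = all(byte < 128 for byte in b)

# Decoding with the same codec restores the original text.
restored = b.decode('unicode_escape')
print(restored == a)
```

The price is that the encoded form differs from backslashreplace output 
whenever the input already contains backslashes, so the two codecs are 
not interchangeable.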
