unicode question

Walter Dörwald walter at livinglogic.de
Mon Feb 27 12:20:19 EST 2006


Edward Loper wrote:

> [...]
> Surely there's a better way than converting back and forth 3 times?  Is
> there a reason that the 'backslashreplace' error mode can't be used with 
> codecs.decode?
> 
>  >>> 'abc \xff\xe8 def'.decode('ascii', 'backslashreplace')
> Traceback (most recent call last):
>    File "<stdin>", line 1, in ?
> TypeError: don't know how to handle UnicodeDecodeError in error callback

The backslashreplace error handler is an *error* *handler*, i.e. it
gives you replacement text if an input character can't be encoded. But
a backslash character in an 8-bit string is not an error, so it won't
get replaced on decoding.
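
To illustrate the encoding direction, here is a minimal sketch. It assumes a Python 3 interpreter, where str is already Unicode (unlike the Python 2 session quoted above); the variable names are only for illustration.

```python
# Python 3 sketch: backslashreplace kicks in while *encoding*,
# when a character has no representation in the target charset.
text = 'abc \xff\xe8 def'  # str containing U+00FF and U+00E8

# ASCII cannot represent U+00FF or U+00E8, so the error handler
# substitutes a backslash escape for each of them.
encoded = text.encode('ascii', 'backslashreplace')

print(encoded)  # b'abc \\xff\\xe8 def'
```

The result is a byte string with literal backslash escapes in place of the two characters that could not be encoded.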

What you want is a different codec (try e.g. "string-escape" or 
"unicode-escape").
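
A small sketch of the codec approach, again assuming Python 3. There the "string-escape" codec is no longer available, so only "unicode-escape" (spelled unicode_escape here) is shown; the input bytes are a made-up example containing literal backslash escapes.

```python
import codecs

# Bytes holding literal backslashes: each "\\xff" is the four
# characters backslash, x, f, f, not a single byte 0xff.
raw = b'abc \\xff\\xe8 def'

# The unicode_escape codec interprets the escapes while decoding.
decoded = codecs.decode(raw, 'unicode_escape')

print(ascii(decoded))  # 'abc \xff\xe8 def'
```

Here the escape handling is the codec's normal job rather than an error path, so no error handler is involved.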

Bye,
    Walter Dörwald



