UnicodeEncodeError: 'ascii' codec can't encode character u'\xb7' in position 13: ordinal not in range(128)

Piet van Oostrum piet at cs.uu.nl
Thu Jul 16 05:09:51 EDT 2009


>>>>> akhil1988 <akhilanger at gmail.com> (a) wrote:

>a> Chris,

>a> Using 

>a> print (u'line: %s' % line).encode('utf-8')

>a> the 'line' gets printed, but I was using this print statement just for
>a> testing. My code actually operates on 'line', on which I use line =
>a> line.decode('utf-8'), as 'line' is read as bytes from a stream.

>a> And if I use line = line.encode('utf-8'), 

>a> I start getting another error, like
>a> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 4561:
>a> ordinal not in range(128)
>a> at line = line.replace('<<', u'«').replace('>>', u'»')

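You do a Unicode replace here, so line must be a unicode string at that
point. Calling replace with unicode arguments on a byte string makes
Python decode the byte string with the default 'ascii' codec first, and
that implicit decode is what fails. Therefore you have to do the replace
after the decode('utf-8'), but before the line.encode('utf-8').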
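
To see where the UnicodeDecodeError comes from, here is a minimal sketch
that does explicitly what Python 2 does implicitly when a byte string
meets unicode arguments. The sample text and the names 'raw' and 'error'
are made up for illustration; they are not from the original code.

```python
# Hypothetical sample data: UTF-8 bytes containing a non-ASCII character.
raw = u'caf\xe9'.encode('utf-8')   # the e-acute becomes the bytes 0xc3 0xa9

error = None
try:
    # Decoding with 'ascii' is what the implicit str -> unicode
    # coercion in Python 2 amounts to; it cannot handle byte 0xc3.
    raw.decode('ascii')
except UnicodeDecodeError as exc:
    error = exc

# The exception records the codec and the offset of the offending byte.
print("%s codec failed at position %d" % (error.encoding, error.start))
```

Decoding explicitly with 'utf-8' instead avoids the implicit 'ascii'
step altogether.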

It might be better to use different variables for Unicode strings and
byte strings to prevent confusion, like:

# 'line' is read as bytes from a stream
uline = line.decode('utf-8')
uline = uline.replace('<<', u'«').replace('>>', u'»')
line = uline.encode('utf-8')
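
The same decode / replace / encode sequence can be wrapped in a small
helper. This is only a sketch: the function name 'replace_guillemets'
and the sample input are assumptions for illustration, and the
guillemets are written as escapes (u'\xab', u'\xbb') so the snippet does
not depend on the source file's encoding declaration.

```python
def replace_guillemets(raw_line):
    """Take UTF-8 bytes, return UTF-8 bytes with << and >> replaced.

    Hypothetical helper, not part of the original code.
    """
    # Bytes -> unicode: all text manipulation happens on unicode.
    uline = raw_line.decode('utf-8')
    # Unicode replace on a unicode string, so no implicit ascii decode.
    uline = uline.replace(u'<<', u'\xab').replace(u'>>', u'\xbb')
    # Unicode -> bytes again for writing back to a byte stream.
    return uline.encode('utf-8')


# Made-up sample line, as it might arrive from a byte stream.
raw = u'caf\xe9 <<bonjour>>'.encode('utf-8')
result = replace_guillemets(raw)
```

Keeping the unicode value in its own variable inside the helper means
the caller only ever sees bytes going in and bytes coming out.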
-- 
Piet van Oostrum <piet at cs.uu.nl>
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: piet at vanoostrum.org


