unicode codecs
Peter Otten
__peter__ at web.de
Mon Feb 9 17:45:55 EST 2004
Ivan Voras wrote:
> When concatenating strings (actually, a constant and a string...) i get
> the following error:
>
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xfc in position 1:
> ordinal not in range(128)
>
> Now I don't think either string is unicode, but I'm working with
> win32api so it might be... :) The point is: I know all values will fit
> in a particular code page (iso-8859-2), so how do I change the 'ascii'
> codec in the above error into something that will work?
You can either convert all strings to unicode or convert them all to
iso-8859-2 byte strings.
A hands-on approach (the examples use iso-8859-1; substitute iso-8859-2
for your data):
>>> u,s
(u'R\xfcbe', 'R\xfcbe')
>>> u+s
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeDecodeError: 'ascii' codec can't decode byte 0xfc in position 1: ordinal not in range(128)
This error is prevented by an explicit conversion:
>>> u.encode("iso-8859-1") + s
'R\xfcbeR\xfcbe'
or
>>> u + s.decode("iso-8859-1")
u'R\xfcbeR\xfcbe'
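The traceback comes from the implicit step: when a unicode string and a
byte string are mixed, Python decodes the byte string with the default
'ascii' codec, and 0xfc is outside that range. A minimal sketch of the
same failure made explicit, assuming a Python that has the bytes type
(2.6 or later, or 3.x); the names raw, message and text are only
illustrative:

```python
raw = b"R\xfcbe"  # a byte string like s above

# Decoding with 'ascii' fails on the byte 0xfc, just as the implicit
# conversion in u + s does.
try:
    raw.decode("ascii")
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

# Naming the real code page makes the conversion succeed.
text = raw.decode("iso-8859-2")
```

Decoding with the code page you know the data is in (iso-8859-2 in your
case) gives a unicode string that can be concatenated safely.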
If you aren't sure which string is unicode and which is not:
>>> def toiso(s):
...     if isinstance(s, unicode):
...         return s.encode("iso-8859-1")
...     return s
...
>>> toiso(u) + toiso(s)
'R\xfcbeR\xfcbe'
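The same idea works in the other direction, if you would rather end up
with unicode everywhere. A rough sketch, not tested against win32api;
the helper name to_unicode and its encoding default are my own
assumptions, and it relies on the bytes type (Python 2.6 or later, or
3.x):

```python
def to_unicode(value, encoding="iso-8859-2"):
    """Return value as text, decoding byte strings with the given codec."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    # Already a unicode string: leave it alone.
    return value


u = u"R\xfcbe"
s = b"R\xfcbe"
combined = to_unicode(u) + to_unicode(s)
```

Here combined is a unicode string regardless of which type each argument
started as.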
Peter