Unicode/ascii encoding nightmare

Mark Peters mpeters42 at gmail.com
Mon Nov 6 15:24:13 EST 2006


> The string below is the encoding of the norwegian word "fødselsdag".
>
> >>> s = 'f\xc3\x83\xc2\xb8dselsdag'

I'm not sure which encoding method you used to get the string above;
it is not what a plain UTF-8 encoding of the word gives me.
Here's the result of my playing with the word in IDLE:

>>> u1 = u'fødselsdag'
>>> u1
u'f\xf8dselsdag'
>>> s1 = u1.encode('utf-8')
>>> s1
'f\xc3\xb8dselsdag'
>>> u2 = s1.decode('utf-8')
>>> u2
u'f\xf8dselsdag'
>>> print u2
fødselsdag
>>>
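
For what it's worth, the quoted bytes look like UTF-8 that was encoded
twice: the correct UTF-8 bytes '\xc3\xb8' for 'ø' read back as two
Latin-1 characters and then encoded to UTF-8 again. That is a guess
about how the string was produced, not something the original post
confirms. A minimal sketch of undoing it, assuming Latin-1 was the
intermediate codec and an interpreter that accepts bytes literals (on
the 2.x interpreter used above, drop the b prefix). The function name
undo_double_utf8 is made up for illustration:

```python
def undo_double_utf8(data):
    """Reverse one round of 'decode as Latin-1, re-encode as UTF-8'.

    Hypothetical helper; assumes Latin-1 was the wrongly applied codec.
    """
    text = data.decode('utf-8')      # strip the outer UTF-8 layer
    raw = text.encode('latin-1')     # map each code point back to one byte
    return raw.decode('utf-8')       # decode the original UTF-8 bytes


s = b'f\xc3\x83\xc2\xb8dselsdag'     # the string quoted at the top
fixed = undo_double_utf8(s)
print(repr(fixed))
```

If the data was not mangled this way, one of the steps will most likely
raise UnicodeDecodeError or UnicodeEncodeError instead of returning a
string, which is a reasonable signal that the guess was wrong.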



