Unicode Question

David Pratt fairwinds at eastlink.ca
Mon Jan 9 20:00:14 EST 2006


Hi. I am working through some tutorials on Unicode and am hoping that 
someone can help explain this for me.  I am on a Mac using Python 
2.4.1 at the moment.  I am experimenting with Unicode using the 3/4 symbol.

I want to prepare strings for db storage that come from a normal Windows 
machine (cp1252), so my understanding is that I should convert them to 
unicode, encode to utf-8, and store the result. Since the data will be 
used on the web, I would then not have to change my encoding when 
extracting from the database. I believe this first example simulates 
this with the 3/4 symbol. Here I want to store '\xc2\xbe' in my database.

 >>> tq = u'\xbe'
 >>> tq_utf = tq.encode('utf8')
 >>> tq, tq_utf
(u'\xbe', '\xc2\xbe')
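
A minimal sketch of the full round trip described above, starting from 
the raw cp1252 byte rather than a unicode literal. It is written with 
b'' literals so that it also runs on later Python versions; that prefix 
is an assumption and is not available in 2.4, where a plain '\xbe' is 
already a byte string. The names raw, text and stored are only for 
illustration.

```python
# The 3/4 symbol as a single byte, as a cp1252 (Windows) machine would send it.
raw = b'\xbe'

# Step 1: decode the cp1252 bytes into a unicode string.
text = raw.decode('cp1252')

# Step 2: encode the unicode string as UTF-8; this is the value to store.
stored = text.encode('utf-8')

# Show the bytes that would go into the database.
print(repr(stored))

# Reading back: decoding with the encoding used for storage recovers the text.
assert stored.decode('utf-8') == text
```

Here stored should hold the two bytes '\xc2\xbe', the same value as 
tq_utf in the session above.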

To unicode with a variable, my understanding is that I can unicode and 
encode at the same time:

 >>> tq = '\xbe'
 >>> tq_utf = unicode(tq, 'utf-8')
Traceback (most recent call last):
   File "<stdin>", line 1, in ?
UnicodeDecodeError: 'utf8' codec can't decode byte 0xbe in position 0: unexpected code byte

This is not working for me. Can someone explain why? Many thanks.

Regards,
David


