Printing UTF-8

sheldon.regular at gmail.com
Thu Sep 21 16:47:50 EDT 2006


I am new to Unicode, so please bear with my stupidity.

I am doing the following in a Python IDE called Wing with Python 2.3.

>>> s = "äöü"
>>> print s
äöü
>>> print s
äöü
>>> s
'\xc3\xa4\xc3\xb6\xc3\xbc'
>>> s.decode('utf-8')
u'\xe4\xf6\xfc'
>>> u = s.decode('utf-8')
>>> u
u'\xe4\xf6\xfc'
>>> print u.encode('utf-8')
äöü
>>> print u.encode('latin1')
äöü

Why can't I get äöü printed from UTF-8 when I can from latin1?  How
can I use UTF-8 exclusively and still be able to print the characters?
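
For reference, the two print calls above send different bytes to the display. The sketch below is written for a current Python (3.x), where byte strings carry a b prefix, rather than the Python 2.3 used in the session. The names text, utf8_bytes, latin1_bytes and misread are illustrative only, and the idea that the IDE output pane interprets bytes as latin1 is an assumption, not something the session confirms.

```python
# The three characters a-umlaut, o-umlaut, u-umlaut as Unicode code points.
text = u'\xe4\xf6\xfc'

# The same text serialized under two different encodings.
utf8_bytes = text.encode('utf-8')      # two bytes per character here
latin1_bytes = text.encode('latin-1')  # one byte per character

# Assumption: a display that expects latin1 is handed the UTF-8 bytes.
# Each byte is then shown as its own character, giving six instead of three.
misread = utf8_bytes.decode('latin-1')

print(repr(utf8_bytes))
print(repr(latin1_bytes))
print(len(text), len(misread))
print([hex(ord(c)) for c in misread])
```

If that assumption holds, the latin1 bytes happen to match what the display expects, while the UTF-8 bytes are valid but get read one byte at a time.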

I also did the same thing on the same machine in a command window...
ActivePython 2.3.2 Build 230 (ActiveState Corp.) based on
Python 2.3.2 (#49, Oct 24 2003, 13:37:57) [MSC v.1200 32 bit (Intel)]
on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> s = "äöü"
>>> print s
äöü
>>> s
'\x84\x94\x81'
>>> s.decode('utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeDecodeError: 'utf8' codec can't decode byte 0x84 in position 0:
unexpected code byte
>>> u = s.decode('utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeDecodeError: 'utf8' codec can't decode byte 0x84 in position 0:
unexpected code byte
>>>

Why is there such a difference between the IDE and the command window in
what they can do and in the internal representation of the string?
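
A possible way to read the second session, again as a sketch for a current Python (3.x): the bytes '\x84\x94\x81' are not valid UTF-8, but they do decode to äöü under the DOS code page cp437. That the command window actually uses cp437 is an assumption (cp850 maps these three bytes the same way), and the variable names are hypothetical.

```python
# Bytes the command window session reported for the typed string.
console_bytes = b'\x84\x94\x81'

# Decoding as UTF-8 fails, as in the traceback: 0x84 cannot start a sequence.
try:
    console_bytes.decode('utf-8')
except UnicodeDecodeError as exc:
    print('not valid UTF-8:', exc.reason)

# Assumption: the console encodes keyboard input with code page 437.
text = console_bytes.decode('cp437')

# Once decoded to Unicode, the text can be re-encoded as UTF-8.
utf8_bytes = text.encode('utf-8')

print([hex(ord(c)) for c in text])
print(repr(utf8_bytes))
```

Under these assumptions the Unicode text is the same in both environments; only the byte encoding of what was typed at the prompt differs.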

Thanks,
Shel



