how to use unicode?
"Martin v. Löwis"
martin at v.loewis.de
Mon Jan 27 04:15:05 EST 2003
gfu wrote:
> >>> u'行都'
> u'\xcb\xc6\xcd\xf8\xd2\xb3'
In Python 2.2, you cannot put non-ASCII characters into a Unicode
literal (*). In Python 2.3, this is possible, but only if you declare
the file encoding (so you cannot readily enter them in interactive mode).
So if you want those characters in a string, you need to write
u'\u884c\u90fd'
Here, U+884C and U+90FD are the Unicode code points of the two
characters you show above. Alternatively, writing
unicode('行都', 'gb2312')
should also work, provided you have a codec for gb2312 installed (and
provided your input is really encoded in gb2312).
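As a rough illustration of the two approaches above, here is a minimal sketch. It assumes Python 3 syntax, where every str is already Unicode and unicode(s, 'gb2312') corresponds to calling .decode('gb2312') on a bytes object; the variable names are made up for the example. It also shows why decoding the same bytes with the wrong codec yields one meaningless character per byte.

```python
# The two characters from the post, written with escapes so the source
# stays pure ASCII (assumption: Python 3, where str is Unicode).
text = '\u884c\u90fd'

# Encode to GB2312 bytes: the kind of byte string a GB2312 terminal
# would hand to the interpreter.
raw = text.encode('gb2312')

# Decoding with the matching codec recovers the two code points.
decoded = raw.decode('gb2312')

# Decoding as Latin-1 maps every byte to exactly one character, so the
# result has one character per byte instead of one per Chinese character.
misread = raw.decode('latin-1')

print(ascii(decoded))
print(len(raw), len(decoded), len(misread))
```

The round trip only works if the bytes really are GB2312; with a different input encoding the decode step may raise UnicodeDecodeError or silently produce other characters.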
HTH,
Martin
(*) Strictly speaking, you can put any Latin-1 character into a Unicode
literal if the file is encoded in Latin-1.