Python and UTF-8

Wolfgang Strobl ws at mystrobl.de
Thu Jan 3 17:15:09 EST 2002


On Thu, 3 Jan 2002 20:41:24 +0100, Matthias Huening
<matthias.huening at t-online.de> wrote:

>----------------------
>>>> import locale
>>>> locale.setlocale(locale.LC_ALL,"")
>'German_Germany.1252'
>>>> t = 'Mühsam ernährt sich das Eichhörnchen.'
>>>> print t.upper()
>MÜHSAM ERNÄHRT SICH DAS EICHHÖRNCHEN.
>>>> tu = unicode(t, 'latin-1').encode('utf-8')
>>>> print tu.upper()

Won't work. 'upper' doesn't know anything about the UTF-8 encoding. It
assumes cp1252, according to the locale settings. 'tu' isn't a Unicode
string; it is an 8-bit string that holds UTF-8 encoded bytes.

>MüHSAM ERNäHRT SICH DAS EICHHöRNCHEN.
>>>> 
>----------------------

(Windows 2000, German)

Welcome To PyCrust 0.7 - The Flakiest Python Shell
Sponsored by Orbtech.com - Your Source For Python Development Services
Python 2.2 (#28, Dec 21 2001, 12:21:22) [MSC 32 bit (Intel)] on win32
Type "copyright", "credits" or "license" for more information.
>>> import locale
>>> locale.setlocale(locale.LC_ALL,"")
'German_Germany.1252'
>>> t="Mühsam ernährt sich das Eichhörnchen"
>>> tu=unicode(t,"latin-1")
>>> tu
u'M\xfchsam ern\xe4hrt sich das Eichh\xf6rnchen'
>>> tu.upper()
u'M\xdcHSAM ERN\xc4HRT SICH DAS EICHH\xd6RNCHEN'
>>> print tu.upper().encode("latin-1")
MÜHSAM ERNÄHRT SICH DAS EICHHÖRNCHEN
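
The same point for text that is already UTF-8 encoded, as a minimal
sketch: decode to a Unicode string first, change the case there, and
encode again only for output. The variable names are made up for
illustration, and the snippet sets no locale. It is written so that it
should also run on a current Python 3, where the 8-bit type is called
bytes; that is an assumption beyond the Python 2.2 session above.

```python
# Hypothetical names; the text is the example sentence from this thread.
text = u"M\xfchsam ern\xe4hrt sich das Eichh\xf6rnchen"

# Encoding yields an 8-bit string: each umlaut becomes two bytes.
utf8_bytes = text.encode("utf-8")

# upper() on the encoded bytes changes the ASCII letters only,
# so the umlauts stay lower case.
upper_bytes = utf8_bytes.upper()

# Decode first, then upper(): the Unicode string knows its characters.
upper_text = utf8_bytes.decode("utf-8").upper()

# Encode again only when bytes are needed, e.g. for output.
upper_utf8 = upper_text.encode("utf-8")
```

Here upper_bytes still contains the lower-case umlauts, while upper_text
has them in upper case, as in the session above.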

-- 
Thank you for observing all safety regulations


