Python and UTF-8
Matthias Huening
matthias.huening at t-online.de
Thu Jan 3 14:41:24 EST 2002
Martin von Loewis <loewis at informatik.hu-berlin.de> wrote in
news:j4itajb9jx.fsf at informatik.hu-berlin.de:
>> How to use regular expressions with Unicode?
>
> Just use the re module: it fully supports Unicode.
>
Not really...
At least the combination of re.I and re.U fails on German text.
But that, again, could be a result of combining 'locale' with
Unicode, right?
I tried this (Win 98, Python 2.1, Idle):
----------------------
>>> import locale
>>> locale.setlocale(locale.LC_ALL,"")
'German_Germany.1252'
>>> t = 'Mühsam ernährt sich das Eichhörnchen.'
>>> print t.upper()
MÜHSAM ERNÄHRT SICH DAS EICHHÖRNCHEN.
>>> tu = unicode(t, 'latin-1').encode('utf-8')
>>> print tu.upper()
MüHSAM ERNäHRT SICH DAS EICHHöRNCHEN.
>>>
----------------------
I expected the umlauts in the UTF-8 string to be uppercased too, but they are not.
Did I miss something?
Matthias
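
A possible explanation for the session above, offered as a hedged sketch
rather than a definitive diagnosis: in the second case, upper() is called
on the UTF-8 encoded bytes, not on the Unicode string. In UTF-8 each umlaut
occupies two bytes, so a byte-wise case conversion cannot treat it as a
single letter. The snippet below assumes a current Python 3 interpreter
(not the Python 2.1 used in the session; behaviour on 2.1 is not verified
here), and the variable names are illustrative only.

```python
import re

# The text as a Unicode string (escapes keep this source pure ASCII).
t = u'M\xfchsam ern\xe4hrt sich das Eichh\xf6rnchen.'

# Case conversion on the Unicode string handles the umlauts.
upper_text = t.upper()

# Case conversion on the UTF-8 bytes: each umlaut is now two bytes
# (for example 0xC3 0xBC for u-umlaut), so a byte-wise conversion
# leaves them alone and only the ASCII letters change.
utf8_bytes = t.encode('utf-8')
upper_bytes = utf8_bytes.upper()

# Case-insensitive matching with pattern and text both Unicode.
# (In Python 3, re.U is already the default for str patterns.)
match = re.search(u'ERN\xc4HRT', t, re.I | re.U)

print(repr(upper_text))
print(repr(upper_bytes))
print(match is not None)
```

If this reading is right, the usual approach is to decode to Unicode first,
do the case conversion or regular-expression matching on the Unicode
string, and encode to UTF-8 only for output.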