Locale case change not working

Clodoaldo clodoaldo.pinto at gmail.com
Thu May 24 05:51:57 EDT 2007


On May 24, 6:40 am, Peter Otten <__pete... at web.de> wrote:
> Clodoaldo wrote:
> > When using unicode the case change works:
>
> > >>> print u'É'.lower()
> > é
>
> > But when using the pt_BR.utf-8 locale it doesn't:
>
> > >>> locale.setlocale(locale.LC_ALL, 'pt_BR.utf-8')
> > 'pt_BR.utf-8'
> > >>> locale.getlocale()
> > ('pt_BR', 'utf')
> > >>> print 'É'.lower()
> > É
>
> > What am I missing? I'm in Fedora Core 5 and Python 2.4.3.
>
> > # cat /etc/sysconfig/i18n
> > LANG="en_US.UTF-8"
> > SYSFONT="latarcyrheb-sun16"
>
> > Regards, Clodoaldo Pinto Neto
>
> str.lower() operates on bytes and therefore doesn't handle encodings with
> multibyte characters (like utf-8) properly:
>
> >>> u"É".encode("utf8")
> '\xc3\x89'
> >>> u"É".encode("latin1")
> '\xc9'
> >>> import locale
> >>> locale.setlocale(locale.LC_ALL, "de_DE.utf8")
> 'de_DE.utf8'
> >>> print unicode("\xc3\x89".lower(), "utf8")
> É
> >>> locale.setlocale(locale.LC_ALL, "de_DE.latin1")
> 'de_DE.latin1'
> >>> print unicode("\xc9".lower(), "latin1")
> é
>
> I recommend that you forget about byte strings and use unicode throughout.

Now I understand it. Thanks.
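
For the archive, a minimal sketch of that advice, assuming the incoming
byte string is UTF-8 encoded. The helper name lower_utf8 is made up for
illustration and is not part of any library: decode to unicode at the
boundary, change case on characters, and encode again only on the way out.

```python
# -*- coding: utf-8 -*-
# Sketch only: lower_utf8 is a hypothetical helper, not a standard function.
# Assumption: the input bytes are valid UTF-8.

def lower_utf8(raw):
    """Lowercase a UTF-8 encoded byte string by going through unicode."""
    text = raw.decode("utf-8")            # bytes -> unicode characters
    return text.lower().encode("utf-8")   # case change per character, then back to bytes

upper_bytes = u"\u00c9".encode("utf-8")   # 'É' encoded as the two bytes C3 89
lower_bytes = lower_utf8(upper_bytes)     # 'é' encoded as the two bytes C3 A9
```

The case change happens on the decoded text, so the locale setting does
not matter for this step.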

Regards, Clodoaldo Pinto Neto



