[issue1813] Codec lookup failing under turkish locale

Sat Feb 16 23:20:16 CET 2008

Marc-Andre Lemburg added the comment:

I agree that it's a bit unfortunate that the 8-bit string APIs in Python
use the locale-aware C functions by default (this should really be
reversed: there should be locale-aware .upper() and .lower() methods, and
the standard ones should work just like the Unicode ones - without
dependency on the locale, using ASCII mappings), but for historical
reasons this cannot easily be changed.
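
To illustrate what "using ASCII mappings" could look like, here is a
minimal sketch of a locale-independent lowercase helper. The name
ascii_lower is hypothetical and not an existing Python API; it only
touches the letters A-Z and leaves every other character alone, so the
result should not change with the active locale.

```python
def ascii_lower(s):
    """Lowercase only the ASCII letters A-Z; leave everything else as-is.

    Hypothetical helper for illustration: it never consults the C locale.
    """
    out = []
    for ch in s:
        if 'A' <= ch <= 'Z':
            # ASCII upper- and lowercase letters differ by exactly 32.
            out.append(chr(ord(ch) + 32))
        else:
            out.append(ch)
    return ''.join(out)
```

A codec name such as "ISO-8859-1" would then normalize to "iso-8859-1"
whether or not a Turkish locale is set.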

.lower() and .upper() for 8-bit strings were always locale dependent, and
before the addition of Unicode, setting the locale was the most common
way to make an application understand different character sets.

In Python 3k the problem will probably go away, since .lower() and
.upper() will then no longer depend on the locale.

Perhaps we should just convert a few of the cases you found to using
Unicode strings instead of 8-bit strings in 2.6? That would both make
the code more portable and also provide a clear statement of "this is a
text string", making porting to Py3k easier.
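
As a rough sketch of that conversion (the function name
normalize_codec_name is an assumption made up for this example, not
taken from the codebase), an 8-bit name could be decoded to a Unicode
string before lowercasing, so that the Unicode case mapping is used
rather than the C locale:

```python
def normalize_codec_name(name):
    """Return a lowercased Unicode version of a codec name.

    Hypothetical helper: 8-bit input is assumed to be pure ASCII and is
    decoded first, so .lower() runs on a Unicode string.
    """
    if isinstance(name, bytes):
        # Decode 8-bit input; raises UnicodeDecodeError on non-ASCII bytes.
        name = name.decode("ascii")
    return name.lower()
```

Calling normalize_codec_name(b"ISO-8859-1") should give u"iso-8859-1"
regardless of the locale setting.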

----------
nosy: +lemburg

__________________________________
Tracker <report at bugs.python.org>
<http://bugs.python.org/issue1813>
__________________________________