[issue4610] Unicode case mappings are incorrect

Martin v. Löwis report at bugs.python.org
Sat Dec 20 19:41:28 CET 2008


Martin v. Löwis <martin at v.loewis.de> added the comment:

> I am trying to get a PEP together for this. Does anyone have any thoughts 
> on how to handle comparison between unicode strings in a locale aware 
> situation?

Implementation-wise, or specification-wise? Implementation-wise, you can
either use the C library or ICU: ICU is better for portability, the C
library for ease of maintenance. Specification-wise: it should just
Do The Right Thing, and probably be exposed either through the locale
module, or through locale objects (in case you want to operate on
multiple different locales in a single program) - see how other OO
languages provide locales.
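
As a rough illustration of the C-library route, the existing locale module
already exposes the C collation functions. The sketch below is only that: the
helper name sort_with_current_locale is made up for this example, and the "C"
locale is chosen because it is always available, whereas a name such as
"de_DE.UTF-8" would be an assumption about what is installed.

```python
import locale


def sort_with_current_locale(words):
    """Sort words by the collation rules of the process-wide LC_COLLATE locale."""
    # locale.strxfrm maps a string to a key whose ordinary comparison
    # agrees with locale.strcoll for the currently selected locale.
    return sorted(words, key=locale.strxfrm)


# Selecting the locale changes global state for the whole process.
locale.setlocale(locale.LC_COLLATE, "C")

words = ["b", "a", "C"]
by_locale = sort_with_current_locale(words)
by_code_point = sorted(words)  # str.__lt__ never consults the locale
```

Because setlocale changes process-wide state, this route only supports one
locale at a time, which is the limitation that locale objects would remove.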

> Should __lt__ and __gt__ be specified as ignoring locale?

Yes.

> In which case do 
> we need to add a new method for doing locale aware comparisons?

No. Collation is a feature of the locale, not of the strings.
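
One possible reading of this, sketched with a hypothetical Collator class (not
an existing Python API): the ordering rules live in an object that receives
plain strings, while < on the strings themselves stays a code point
comparison. The "toy-caseless" rule is an invented stand-in for real locale
data.

```python
class Collator:
    """Hypothetical collation object: it owns the ordering rules, strings do not."""

    def __init__(self, name, key):
        self.name = name  # identifier such as "C" or "toy-caseless"
        self._key = key   # function mapping a string to a sort key

    def compare(self, a, b):
        """Return -1, 0 or 1, in the spirit of strcoll."""
        ka, kb = self._key(a), self._key(b)
        return (ka > kb) - (ka < kb)

    def sorted(self, strings):
        """Return the strings ordered by this collator's rules."""
        return sorted(strings, key=self._key)


# Two collators coexisting in one program; neither is attached to any string.
code_point = Collator("C", lambda s: s)
caseless = Collator("toy-caseless", lambda s: (s.casefold(), s))

words = ["banana", "Cherry", "apple"]
print(code_point.sorted(words))  # ['Cherry', 'apple', 'banana']
print(caseless.sorted(words))    # ['apple', 'banana', 'Cherry']
print("Cherry" < "apple")        # True: plain comparison ignores any locale
```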

> Should locale be a property of the string, an argument passed to 
> upper/lower/isupper/islower/swapcase/capitalize/sort or global state 
> (locale module...)?

Either global state, or the object *that gets the strings passed to it*.

> Should doing a locale aware comparison of two strings with different 
> locales throw an exception?

Strings should not be tied into locales.

> Should locales be represented as objects or just a string like "en_GB"?

If you want to have several of them at the same time, you need objects.
You still need to identify them by name.
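
A minimal sketch of what that could look like, assuming a hypothetical Locale
class and get_locale lookup (neither exists in the standard library). The
override table is a toy: it contains only the well-known Turkish rule that
"i" uppercases to U+0130, not real locale data.

```python
class Locale:
    """Hypothetical locale object; strings are passed to it, not tied to it."""

    def __init__(self, name, upper_overrides=None):
        self.name = name
        self._upper_overrides = upper_overrides or {}

    def upper(self, s):
        # Use the per-locale exception if there is one, otherwise fall back
        # to the locale-independent default mapping of str.upper().
        return "".join(self._upper_overrides.get(ch, ch.upper()) for ch in s)


# Toy registry: locale objects are still looked up by name.
_LOCALES = {
    "en_GB": Locale("en_GB"),
    "tr_TR": Locale("tr_TR", {"i": "\u0130"}),
}


def get_locale(name):
    """Return the locale object registered under the given name."""
    return _LOCALES[name]


# Two locales in use simultaneously, with no global state involved.
english = get_locale("en_GB")
turkish = get_locale("tr_TR")
print(english.upper("title"))  # TITLE
print(turkish.upper("title"))  # TİTLE (dotted capital I)
```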

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue4610>
_______________________________________

