[issue4610] Unicode case mappings are incorrect
Martin v. Löwis
report at bugs.python.org
Sat Dec 20 19:41:28 CET 2008
Martin v. Löwis <martin at v.loewis.de> added the comment:
> I am trying to get a PEP together for this. Does anyone have any thoughts
> on how to handle comparison between unicode strings in a locale aware
> situation?
Implementation-wise, or specification-wise? Implementation-wise, you can
either use the C library or ICU. ICU is better for portability; the C
library is better for maintenance. Specification-wise: it should just
Do The Right Thing, and probably be exposed either through the locale
module, or through locale objects (in case you want to operate on
multiple different locales in a single program) - see how other OO
languages provide locales.
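
As a rough illustration of the C-library route through the locale module,
the sketch below sets the process-wide collation locale and sorts with
locale.strcoll. The helper names use_collation and locale_sorted are made
up for this example; only locale.setlocale, locale.LC_COLLATE and
locale.strcoll come from the standard library. Which locale names are
available depends on the platform, so only "C" is assumed here.

```python
import functools
import locale


def use_collation(locale_name=""):
    """Set the process-wide collation locale (global state).

    An empty string asks the C library for the user's default locale.
    Returns the name of the locale that was actually set.
    """
    return locale.setlocale(locale.LC_COLLATE, locale_name)


def locale_sorted(strings):
    """Sort strings with the C library's collation for the current locale."""
    return sorted(strings, key=functools.cmp_to_key(locale.strcoll))


# The "C" locale is always available; other names are platform-dependent.
use_collation("C")
print(locale_sorted(["b", "a", "C"]))
```

Because the setting is global, every caller in the process sees the same
collation order until setlocale is called again.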
> Should __lt__ and __gt__ be specified as ignoring locale?
Yes.
> In which case do
> we need to add a new method for doing locale aware comparisons?
No. Collation is a feature of the locale, not of the strings.
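
To make "ignoring locale" concrete: in Python 3, the comparison operators
on str order strings by code point, so no locale is consulted. The helper
codepoint_less below is a made-up name used only to spell that rule out;
it is a sketch of the behaviour, not how the interpreter implements it.

```python
def codepoint_less(a, b):
    """Compare two strings by their sequences of Unicode code points."""
    return [ord(ch) for ch in a] < [ord(ch) for ch in b]


words = ["apple", "Zebra", "banana"]

# sorted() without a key uses str.__lt__, which follows code point order,
# so the uppercase "Z" (U+005A) sorts before the lowercase "a" (U+0061).
print(sorted(words))

# The operator and the explicit code point comparison agree.
print(("Zebra" < "apple") == codepoint_less("Zebra", "apple"))
```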
> Should locale be a property of the string, an argument passed to
> upper/lower/isupper/islower/swapcase/capitalize/sort or global state
> (locale module...)?
Either global state, or the object *that gets the strings passed to it*.
> Should doing a locale aware comparison of two strings with different
> locales throw an exception?
Strings should not be tied into locales.
> Should locales be represented as objects or just a string like "en_GB"?
If you want to use several of them simultaneously, you need objects.
You still need to identify them by name.
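
One possible shape for such an object, sketched under heavy assumptions:
the Collator class and its methods are hypothetical and not part of any
existing Python API. It is identified by a locale name and gets the
strings passed to it, so the strings themselves carry no locale. This
sketch borrows the C library by switching LC_COLLATE temporarily, which
is process-wide and not thread-safe; an ICU-backed object would not need
that trick.

```python
import locale


class Collator:
    """Hypothetical locale object, identified by a locale name."""

    def __init__(self, name):
        self.name = name

    def _with_locale(self, func, *args):
        # Remember the current collation locale, switch, and restore it.
        saved = locale.setlocale(locale.LC_COLLATE)
        try:
            locale.setlocale(locale.LC_COLLATE, self.name)
            return func(*args)
        finally:
            locale.setlocale(locale.LC_COLLATE, saved)

    def compare(self, a, b):
        """Return a negative, zero, or positive number, like strcoll."""
        return self._with_locale(locale.strcoll, a, b)

    def sorted(self, strings):
        """Return the strings sorted by this locale's collation rules."""
        return self._with_locale(
            lambda: sorted(strings, key=locale.strxfrm)
        )


# Only the "C" locale is assumed to exist; other names vary by platform.
c_locale = Collator("C")
print(c_locale.name, c_locale.sorted(["b", "a"]))
```

Two such objects with different names can coexist in one program, which
a single global setting cannot offer.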
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue4610>
_______________________________________