[issue5200] unicode.normalize gives wrong result for some characters
Martin v. Löwis
report at bugs.python.org
Tue Feb 10 19:59:23 CET 2009
Martin v. Löwis <martin at v.loewis.de> added the comment:
It is not true that normalize produces "aaoAAO". Instead, it produces
u'a\u030aa\u0308o\u0308A\u030aA\u0308O\u0308'
This is the correct result, according to the Unicode specification. It
would be incorrect to leave these characters unchanged under Unicode
Normalization Form D (canonical decomposition); the canonical
decomposition of 'LATIN SMALL LETTER A WITH RING ABOVE' (for example)
is 'LATIN SMALL LETTER A' + 'COMBINING RING ABOVE'.
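
The decomposition described above can be checked with a short sketch
using the standard unicodedata module. The input string is an
assumption (the six precomposed letters inferred from the output quoted
above; the original report's exact input is not shown here), and the
variable names are illustrative only.

```python
import unicodedata

# Assumed input: the six precomposed letters, written as escapes so the
# example does not depend on the source file's encoding.
composed = "\u00e5\u00e4\u00f6\u00c5\u00c4\u00d6"

# NFD splits each precomposed letter into base letter + combining mark.
decomposed = unicodedata.normalize("NFD", composed)
print(ascii(decomposed))

# Name the two code points that U+00E5 decomposes into.
for ch in decomposed[:2]:
    print("U+%04X %s" % (ord(ch), unicodedata.name(ch)))

# NFC recomposes the sequence, so no information is lost.
assert unicodedata.normalize("NFC", decomposed) == composed
```

Each letter becomes two code points, so the decomposed string has
twice the length of the input. It only looks like "aaoAAO" when the
combining marks are dropped or not rendered.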
The Wikipedia article is irrelevant; refer to the Unicode specification
for a normative reference.
Closing as invalid.
----------
nosy: +loewis
resolution: -> invalid
status: open -> closed
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue5200>
_______________________________________