Incorrect title case?

"Martin v. Löwis" martin at v.loewis.de
Sat Jan 17 18:15:00 EST 2009


> A value of zero for ctype->title should be interpreted simply as the
> offset to add to the ordinal, as it is in the sibling
> _PyUnicode_To(Upper|Lower)case functions.

Interestingly enough, according to the spec of UnicodeData.txt,
these should *not* be siblings. Refer to

http://www.unicode.org/Public/UNIDATA/UCD.html

For lower and upper case, it says

Note: The simple uppercase is omitted in the data file if the uppercase
is the same as the code point itself.

whereas for titlecase, it says

Note: The simple titlecase may be omitted in the data file if the
titlecase is the same as the uppercase.

So unicodectype is right to fall back to uppercase if no titlecase
mapping is given.
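
To make the fallback rule concrete, here is a minimal sketch of how the two
notes combine when reading one record. It assumes the usual semicolon-separated
UnicodeData.txt layout, with simple uppercase in field 12 and simple titlecase
in field 14. The function name and the sample records are made up for
illustration (the first record deliberately omits its titlecase field), and
this is not the code unicodectype actually uses.

```python
# Field positions in a UnicodeData.txt record (0-based), as assumed here.
UPPER_FIELD = 12
TITLE_FIELD = 14


def simple_titlecase(record: str) -> int:
    """Return the simple titlecase code point for one record.

    Hypothetical helper: applies the fallback chain from the two notes,
    titlecase -> uppercase -> the code point itself.
    """
    fields = record.split(";")
    code_point = int(fields[0], 16)
    # Uppercase omitted means "same as the code point itself".
    upper = int(fields[UPPER_FIELD], 16) if fields[UPPER_FIELD] else code_point
    # Titlecase omitted means "same as the uppercase".
    return int(fields[TITLE_FIELD], 16) if fields[TITLE_FIELD] else upper


# Illustrative records only; do not treat them as verbatim file contents.
omitted_title = "0061;LATIN SMALL LETTER A;Ll;0;L;;;;;N;;;0041;;"
no_case_at_all = "0031;DIGIT ONE;Nd;0;EN;;1;1;1;N;;;;;"
distinct_title = "01C6;LATIN SMALL LETTER DZ WITH CARON;Ll;0;L;;;;;N;;;01C4;;01C5"

for rec in (omitted_title, no_case_at_all, distinct_title):
    print(rec.split(";")[0], "->", format(simple_titlecase(rec), "04X"))
```

Under the UCD.html wording, the first record titlecases to 0041 rather than
staying at 0061, which is exactly the difference from treating an empty field
as "no change".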

However, this looks like a bug in UCD.html: the note for titlecase should
probably read the same as the one for lowercase and uppercase
(at least, that's how UnicodeData.txt seems to be generated).

Regards,
Martin


