[issue10254] unicodedata.normalize('NFC', s) regression

Martin v. Löwis report at bugs.python.org
Fri Dec 17 20:08:11 CET 2010


Martin v. Löwis <martin at v.loewis.de> added the comment:

> The C forms (NFC and NFKC) do canonical composition and U+FDFA is a
> compatibility composite. (BTW, makeunicodedata.py checks that maximum
> decomposed length of a character is < 19, but it would be better if it
> would compute and define a named constant, say MAXDLENGTH, to be used
> instead of literal 20.)  As far as I (and a two-line script) can tell
> the maximum length of a canonical decomposition of a character is 4.

Even better - so allowing for 20 characters should be safe.
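
For anyone who wants to check these figures, here is a minimal sketch of
such a measurement. It is not the two-line script mentioned in the quote,
and it is not how makeunicodedata.py performs its check. The helper names
decomposed_length and max_decomposed_length are made up for illustration.
It assumes Python 3, and the results depend on the Unicode database
bundled with the interpreter.

```python
import sys
import unicodedata


def decomposed_length(ch, form):
    """Return the number of code points in the full decomposition of ch."""
    return len(unicodedata.normalize(form, ch))


def max_decomposed_length(form):
    """Scan every non-surrogate code point; return the longest decomposition."""
    return max(
        decomposed_length(chr(cp), form)
        for cp in range(sys.maxunicode + 1)
        if not 0xD800 <= cp <= 0xDFFF  # skip lone surrogates
    )


if __name__ == "__main__":
    # NFD applies canonical decomposition only; NFKD also applies
    # compatibility decomposition.
    print("longest canonical decomposition (NFD):",
          max_decomposed_length("NFD"))
    print("longest compatibility decomposition (NFKD):",
          max_decomposed_length("NFKD"))
    print("U+FDFA under NFKD:", decomposed_length("\ufdfa", "NFKD"))
```

Since U+FDFA has only a compatibility decomposition, NFD leaves it as a
single code point, so the long expansion only appears under the K forms.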

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue10254>
_______________________________________

