[Python-3000] Unicode IDs -- why NFC? Why allow ligatures?

"Martin v. Löwis" martin at v.loewis.de
Tue Jun 5 18:56:37 CEST 2007


> I'd love to get rid of full-width ASCII and halfwidth kana (via
> compatibility decomposition).  Native Japanese speakers often use them
> interchangeably with the "proper" versions when correcting typos and
> updating numbers in a series.  Ugly, to say the least.  I don't think
> that native Japanese would care, as long as the decomposition is done
> internally to Python.

Not sure what the proposal is here. If people say "we want the PEP to
do NFKC", I understand that as "instead of saying NFC, it should say
NFKC", which in turn means "all identifiers are converted into the
normal form NFKC while parsing".

With that change, the full-width ASCII characters would still be
allowed in source - they just wouldn't be different from the regular
ones anymore when comparing identifiers.
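
To illustrate, here is a minimal sketch of what "converted into NFKC
while parsing" could look like. The helper name normalize_identifier is
hypothetical and only stands in for whatever the parser would do; the
only real API used is unicodedata.normalize from the standard library.

```python
import unicodedata

def normalize_identifier(name: str) -> str:
    # Hypothetical helper: fold an identifier to NFKC, as the parser
    # would under the "PEP says NFKC" reading.
    return unicodedata.normalize("NFKC", name)

fullwidth_foo = "\uff46\uff4f\uff4f"   # full-width Latin letters f, o, o
halfwidth_ka = "\uff76"                # halfwidth katakana KA

# Both spellings are still accepted, but compare equal to the
# regular forms once normalized.
print(normalize_identifier(fullwidth_foo) == "foo")
print(normalize_identifier(halfwidth_ka) == "\u30ab")  # full-width katakana KA
```

Under this reading the source text itself is left untouched; only the
identifier used for comparison changes.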

Another option would be to require that the source is in NFKC already,
where I then ask again what precisely that means in the presence of
non-UTF source encodings.
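
One possible reading, and it is only an assumption, is that the check
applies to the text after decoding from the declared source encoding.
A rough sketch of that interpretation follows; source_is_nfkc is a
hypothetical name, and for simplicity it checks the whole file rather
than identifiers only.

```python
import unicodedata

def source_is_nfkc(raw: bytes, encoding: str) -> bool:
    # Assumption: "source is in NFKC" is judged on the decoded text,
    # not on the raw bytes of the file.
    text = raw.decode(encoding)
    return unicodedata.normalize("NFKC", text) == text

# Plain ASCII source is trivially in NFKC.
print(source_is_nfkc(b"foo = 1", "ascii"))

# A Shift_JIS file containing halfwidth katakana KA would be rejected,
# while the same file using full-width katakana KA would pass.
print(source_is_nfkc("\uff76 = 1".encode("shift_jis"), "shift_jis"))
print(source_is_nfkc("\u30ab = 1".encode("shift_jis"), "shift_jis"))
```

Whether a rule phrased this way is what people actually want is exactly
the open question.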

Regards,
Martin

