[Python-3000] PEP: Supporting Non-ASCII Identifiers

"Martin v. Löwis" martin at v.loewis.de
Sun Jun 3 20:43:03 CEST 2007


Rauli Ruohonen wrote:
> This is only almost true. Consider these two hypothetical files
> written by naive newbies:
> 
> data.py:
> 
> favorite_colors = {'Martin Löwis': 'blue'}
> 
> code.py:
> 
> import data
> 
> print data.favorite_colors['Martin Löwis']

That is an unrealistic example. It's more likely that the
second access reads

user = find_current_user()
print data.favorite_colors[user]

To deal with that safely, I would recommend writing

favorite_colors = nfc_dict({'Martin Löwis': 'blue'})
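
Note that nfc_dict is not an existing type; the name only stands for "a dictionary that normalizes its keys to NFC". A minimal sketch of what such a class might look like, assuming Python 3 (where str is Unicode text) and the standard unicodedata module, and covering only the most common access paths:

```python
import unicodedata


def _nfc(key):
    # Normalize only text keys; other key types pass through unchanged.
    return unicodedata.normalize("NFC", key) if isinstance(key, str) else key


class nfc_dict(dict):
    """Hypothetical dict that stores and looks up str keys in NFC form."""

    def __init__(self, *args, **kwargs):
        super().__init__()
        # Route initial items through update() so their keys get normalized.
        self.update(*args, **kwargs)

    def __setitem__(self, key, value):
        super().__setitem__(_nfc(key), value)

    def __getitem__(self, key):
        return super().__getitem__(_nfc(key))

    def __contains__(self, key):
        return super().__contains__(_nfc(key))

    def get(self, key, default=None):
        return super().get(_nfc(key), default)

    def update(self, *args, **kwargs):
        for key, value in dict(*args, **kwargs).items():
            self[key] = value


favorite_colors = nfc_dict({"Martin L\u00f6wis": "blue"})

# The same name in decomposed form (o followed by COMBINING DIAERESIS),
# as it might arrive from a file system or another program.
decomposed = "Martin Lo\u0308wis"
print(favorite_colors[decomposed])
```

With this sketch, a lookup succeeds regardless of which normalization form the caller's string happens to be in. Methods such as pop(), setdefault() and __delitem__ are not covered and would need the same treatment.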

> The most important thing about normalization is that it should be
> consistent for internal strings. Similarly when reading in a text
> file, you really should normalize it first, if you're going to
> handle it as *text*, not binary.
> 
> The most common normalization is NFC, because it works best
> everywhere and causes the least amount of surprise. E.g.
> "Löwis"[2] results in "w", not in u'\u0308' (COMBINING DIAERESIS),
> which most naive users won't expect.
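
The indexing difference described in the quoted text can be reproduced with the standard unicodedata module. This is a small illustration only, assuming Python 3 string semantics; the variable names are made up for the example:

```python
import unicodedata

name = "L\u00f6wis"  # "Löwis" written with the precomposed character U+00F6

nfc = unicodedata.normalize("NFC", name)  # composed form
nfd = unicodedata.normalize("NFD", name)  # decomposed form: o + U+0308

# The two strings render identically but differ in length and indexing.
print(len(nfc), len(nfd))
print(repr(nfc[2]))
print(unicodedata.name(nfd[2]))
```

Normalizing text once, when it is read in, keeps indexing and comparisons of this kind consistent for the rest of the program.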

Sure. If you think it is worth the effort, write a PEP.
PEP 3131 is only about identifiers.

Regards,
Martin


