[issue18234] Unicodedata module should provide access to codepoint aliases

Marc-Andre Lemburg report at bugs.python.org
Mon Jun 24 09:54:05 CEST 2013


Marc-Andre Lemburg added the comment:

On 23.06.2013 22:43, Alexander Belopolsky wrote:
> 
> Alexander Belopolsky added the comment:
> 
> unicodedata.name() was discussed in #12353 (msg144739) where MvL argued that misspelled names are better than corrected ones because they are more likely to appear misspelled in other sources.  I am not sure I buy this argument.  Someone googling for 'BYZANTINE MUSICAL SYMBOL FHTORA SKLIRON CHROMA VASIS' will probably just enter BYZANTINE VASIS and find what he or she needs.  A more likely scenario is someone trying to get all FTHORA symbols using naive code like this: [hex(i) for i in range(1114112) if 'FTHORA' in ud.name(chr(i), '')].
> 
> An even more likely scenario is someone seeing a fancy symbol on the web and wanting to use it in a Python program.  Such a programmer would copy the symbol to the Python prompt, call unicodedata.name() and copy the result into the program.  Do we want to encourage people to perpetuate the mistake that Unicode has corrected?
> 
> I don't think the issue of control code names was discussed in #12353.  I see no downside to returning the first alias in case no name is present.

We should stick to the rules. Please leave the function as it
is, i.e. a 1-1 mapping to the official, non-changing Unicode
name reference (including spelling errors, etc). Same with
code points that have no name.
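
As an illustration of the current behavior, here is a minimal
sketch. It assumes Python 3.3 or later (where lookup() accepts
name aliases) and uses U+1D0C5, the character from the quoted
example; the variable names are only for illustration.

```python
import unicodedata

ch = chr(0x1D0C5)

# name() returns the official, immutable Unicode name,
# including the historical misspelling "FHTORA".
official = unicodedata.name(ch)

# lookup() also resolves the corrected alias to the same
# character (name aliases are supported since Python 3.3).
corrected = unicodedata.lookup(
    "BYZANTINE MUSICAL SYMBOL FTHORA SKLIRON CHROMA VASIS")

# Control characters have no official name: name() raises
# ValueError unless a default value is supplied.
try:
    unicodedata.name("\x00")
    raised = False
except ValueError:
    raised = True

no_name = unicodedata.name("\x00", "")
```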

If you want to expose the aliases, you can do so in a new
function, say .aliases() which then returns the list of
aliases of a character (including the original name,
if available).
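
A rough sketch of what such a function could look like follows.
Everything in it is an assumption made for illustration: the
helper names parse_name_aliases() and aliases() do not exist in
unicodedata, and the alias table is a three-line excerpt in the
NameAliases.txt format (code point;alias;type), not the full
file.

```python
import unicodedata

# Excerpt in the format of the UCD file NameAliases.txt.
# Illustration only: the real file has many more entries.
SAMPLE_NAME_ALIASES = """\
# code point;alias;type
0000;NULL;control
0000;NUL;abbreviation
1D0C5;BYZANTINE MUSICAL SYMBOL FTHORA SKLIRON CHROMA VASIS;correction
"""


def parse_name_aliases(text):
    """Map each code point to its aliases, in file order."""
    table = {}
    for line in text.splitlines():
        # Drop comments and skip blank lines.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        code, alias, _kind = line.split(";")
        table.setdefault(int(code, 16), []).append(alias)
    return table


_ALIASES = parse_name_aliases(SAMPLE_NAME_ALIASES)


def aliases(char):
    """Hypothetical helper: official name (if any), then aliases."""
    names = []
    official = unicodedata.name(char, None)
    if official is not None:
        names.append(official)
    names.extend(_ALIASES.get(ord(char), []))
    return names
```

With this layout, name() keeps returning the official name
unchanged, while callers who want the corrected spelling or a
control code name ask for it explicitly through the separate
helper.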

If we change the return values of .name() to whatever we think
would be more usable, we'd be modifying how Python programmers
see the Unicode database. That's not the purpose of the module.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue18234>
_______________________________________

