[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace
Ezio Melotti
report at bugs.python.org
Fri Sep 30 22:30:42 CEST 2011
Ezio Melotti <ezio.melotti at gmail.com> added the comment:
Leaving named sequences for unicodedata.lookup() only (and not for \N{}) makes sense.
The list of aliases is so small (11 entries) that I doubt a binary search over it would bring any advantage. A single lookup algorithm that searches both tables doesn't work either: the alias lookup must live in _getcode for \N{...} to work, whereas the lookup of named sequences will happen in unicodedata_lookup (Modules/unicodedata.c:1187).
I think we can leave the for loop over the aliases in _getcode and implement a separate binary search in unicodedata_lookup for the named sequences. Does that sound fine?
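
To make the proposed split concrete, here is a rough Python sketch of the two lookup paths. It is only an illustration of the idea, not the C code in Modules/unicodedata.c: the function names getcode and lookup, the regular_names argument, and the tiny sample tables are all assumptions made for the example.

```python
import bisect

# Illustrative sample data only (an assumption for this sketch); the real
# tables are generated from the Unicode data files and live in C.
ALIASES = [
    ("LATIN CAPITAL LETTER GHA", 0x01A2),
    ("LATIN SMALL LETTER GHA", 0x01A3),
]

# Named sequences map one name to several code points, so they are stored
# as strings. The table is kept sorted by name so it can be bisected.
NAMED_SEQUENCES = sorted([
    ("KEYCAP NUMBER SIGN", "\u0023\u20E3"),
    ("LATIN CAPITAL LETTER A WITH MACRON AND GRAVE", "\u0100\u0300"),
    ("LATIN SMALL LETTER A WITH MACRON AND GRAVE", "\u0101\u0300"),
])
_SEQUENCE_NAMES = [name for name, _ in NAMED_SEQUENCES]


def getcode(name, regular_names):
    # Hypothetical stand-in for _getcode: the path shared by the \N{...}
    # escape and lookup(). It knows regular names and aliases, but not
    # named sequences.
    code = regular_names.get(name)
    if code is not None:
        return code
    # The alias table is tiny, so a plain linear scan is enough.
    for alias, code in ALIASES:
        if alias == name:
            return code
    return None


def lookup(name, regular_names):
    # Hypothetical stand-in for unicodedata_lookup: try the shared path
    # first, then fall back to a binary search over the named sequences.
    code = getcode(name, regular_names)
    if code is not None:
        return chr(code)
    i = bisect.bisect_left(_SEQUENCE_NAMES, name)
    if i < len(_SEQUENCE_NAMES) and _SEQUENCE_NAMES[i] == name:
        return NAMED_SEQUENCES[i][1]
    raise KeyError("undefined character name %r" % name)
```

With this layout, anything that goes through getcode alone (as \N{...} would) resolves aliases but never sees a named sequence, while lookup resolves both.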
----------
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue12753>
_______________________________________