[issue12753] \N{...} neglects formal aliases and named sequences from Unicode charnames namespace

Martin v. Löwis report at bugs.python.org
Fri Sep 30 12:00:49 CEST 2011


Martin v. Löwis <martin at v.loewis.de> added the comment:

I propose switching to a better lookup algorithm based on binary search, and then integrating the NamedSequences into it as well. The search result could be a record like this:

 struct {
   const char *name;   /* name used as the search key */
   int len;            /* number of entries used in chars */
   Py_UCS4 chars[3];   /* no sequence is more than 3 chars */
 };

You would have two tables for these: one for the aliases, and one for the named sequences.
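To make the idea concrete, here is a minimal sketch of a binary search over one such table, assuming the table is kept sorted by name in strcmp() order. The names ucs4_t, name_record, named_sequences and find_record are hypothetical and exist only for this illustration (ucs4_t stands in for Py_UCS4 so the snippet compiles outside CPython). The four entries are a small illustrative excerpt and should be checked against NamedSequences.txt before being relied on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for CPython's Py_UCS4 so the sketch compiles on its own. */
typedef uint32_t ucs4_t;

/* Hypothetical record type following the struct proposed above. */
typedef struct {
    const char *name;   /* search key */
    int len;            /* number of entries used in chars */
    ucs4_t chars[3];
} name_record;

/* Illustrative excerpt only; must stay sorted in strcmp() order by name. */
static const name_record named_sequences[] = {
    {"KATAKANA LETTER AINU P",                       2, {0x31F7, 0x309A}},
    {"LATIN CAPITAL LETTER A WITH MACRON AND GRAVE", 2, {0x0100, 0x0300}},
    {"LATIN SMALL LETTER A WITH MACRON AND GRAVE",   2, {0x0101, 0x0300}},
    {"LATIN SMALL LETTER R WITH TILDE",              2, {0x0072, 0x0303}},
};

#define NAMED_SEQUENCE_COUNT \
    (sizeof(named_sequences) / sizeof(named_sequences[0]))

/* Binary search for an exact name match; returns NULL if not found. */
static const name_record *
find_record(const name_record *table, size_t count, const char *name)
{
    size_t lo = 0, hi = count;          /* search the half-open range [lo, hi) */

    assert(name != NULL);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        int cmp = strcmp(name, table[mid].name);

        if (cmp == 0)
            return &table[mid];
        if (cmp < 0)
            hi = mid;                   /* name sorts before table[mid] */
        else
            lo = mid + 1;               /* name sorts after table[mid] */
    }
    return NULL;
}
```

The same function would serve the alias table as well, since only the table pointer and its length change between the two.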

_getcode would continue to return a single char only, and thus would not support named sequences. lookup could well return strings longer than one character, but only in 3.3.
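As a rough sketch of that split, assuming a record shaped like the struct above: one helper serves the single-character path and refuses anything longer, the other copies out the whole sequence. The names ucs4_t, name_record, record_to_code and record_to_string are hypothetical and are not the actual unicodedata internals.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for CPython's Py_UCS4 so the sketch compiles on its own. */
typedef uint32_t ucs4_t;

/* Hypothetical record type following the struct proposed above. */
typedef struct {
    const char *name;
    int len;            /* number of entries used in chars */
    ucs4_t chars[3];
} name_record;

/* Single-character path: succeeds only for records of exactly one
   code point. Returns 1 and stores the code point on success,
   0 otherwise (for example, for a named sequence). */
static int
record_to_code(const name_record *rec, ucs4_t *code)
{
    if (rec == NULL || rec->len != 1)
        return 0;
    *code = rec->chars[0];
    return 1;
}

/* String path: copies every code point of the record into out.
   Returns the number copied, or -1 if the record is missing or
   does not fit into cap entries. */
static int
record_to_string(const name_record *rec, ucs4_t *out, size_t cap)
{
    int i;

    if (rec == NULL || rec->len < 0 || (size_t)rec->len > cap)
        return -1;
    for (i = 0; i < rec->len; i++)
        out[i] = rec->chars[i];
    return rec->len;
}
```

With this shape, an alias such as LATIN CAPITAL LETTER GHA (U+01A2) passes through both helpers, while a two-character named sequence is only accepted by the string path.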

I'm not sure that \N escapes should support named sequences: people rightfully expect that each escaped element in a string literal constitutes exactly one character.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue12753>
_______________________________________

