[issue12749] lib re cannot match non-BMP ranges (all versions, all builds)

Ezio Melotti report at bugs.python.org
Sun Aug 14 18:16:28 CEST 2011


Ezio Melotti <ezio.melotti at gmail.com> added the comment:

I haven't looked at the code, but I think that the re module is just trying to calculate the range between the low surrogate of 𝒜 and the high surrogate of 𝒵.
If this is the case, this is the "usual bug" that narrow builds have.
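
To make the suspected failure concrete, here is a minimal sketch of the surrogate arithmetic. It is not taken from the re source: the helper name surrogate_pair is hypothetical, and the sketch assumes that a narrow build sees each non-BMP character as two UTF-16 code units. It is written for Python 3 so it runs on current interpreters.

```python
def surrogate_pair(ch):
    """Return the (high, low) UTF-16 surrogates for a non-BMP character."""
    offset = ord(ch) - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return high, low


a_high, a_low = surrogate_pair("\N{MATHEMATICAL SCRIPT CAPITAL A}")
z_high, z_low = surrogate_pair("\N{MATHEMATICAL SCRIPT CAPITAL Z}")

# Assumption: on a narrow build the class is seen as four code units,
#   [ a_high  a_low - z_high  z_low ]
# so the "-" sits between the low surrogate of A and the high surrogate of Z.
range_start, range_end = a_low, z_high
range_is_inverted = range_start > range_end

print(hex(a_high), hex(a_low), hex(z_high), hex(z_low), range_is_inverted)
```

If that reading is right, the range would start at a low surrogate (0xDC00 and up) and end at a high surrogate (below 0xDC00), so its start would be greater than its end and the range could not be valid.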

Also note that re.search(u"[\N{MATHEMATICAL SCRIPT CAPITAL A}-\N{MATHEMATICAL SCRIPT CAPITAL Z}]".encode('utf-8'), u"\N{MATHEMATICAL SCRIPT CAPITAL C}".encode('utf-8'), re.UNICODE)
"works", but it returns a wrong result.
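
One way to see why the encoded version gives a wrong result is the sketch below. It is a Python 3 adaptation, not the original call: re.UNICODE is dropped because Python 3 rejects that flag on bytes patterns, and the variable names are illustrative. The assumption is that the class is then treated as a set of individual bytes rather than a range of characters.

```python
import re

script_a = "\N{MATHEMATICAL SCRIPT CAPITAL A}".encode("utf-8")
script_z = "\N{MATHEMATICAL SCRIPT CAPITAL Z}".encode("utf-8")
script_c = "\N{MATHEMATICAL SCRIPT CAPITAL C}".encode("utf-8")

# The class is built from raw UTF-8 bytes, so the "-" ends up between the
# last byte of A and the first byte of Z instead of between two characters.
pattern = b"[" + script_a + b"-" + script_z + b"]"

match = re.search(pattern, script_c)
matched_bytes = match.group()
matched_whole_character = matched_bytes == script_c

print(matched_bytes, matched_whole_character)
```

Here the search succeeds, but the match covers a single byte of the four-byte encoded character, which is consistent with a call that "works" while returning a wrong result.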

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue12749>
_______________________________________

