[issue12729] Python lib re cannot handle Unicode properly due to narrow/wide bug

Matthew Barnett report at bugs.python.org
Sat Aug 13 04:59:33 CEST 2011


Matthew Barnett <python at mrabarnett.plus.com> added the comment:

In a narrow build, a codepoint in the astral plane is encoded as a surrogate pair.
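
As a rough illustration of what that means, here is a minimal sketch of the standard UTF-16 arithmetic that splits an astral codepoint into its two surrogates. The helper name to_surrogate_pair is made up for this example; it is not part of re or the regex module.

```python
def to_surrogate_pair(codepoint):
    """Split an astral codepoint (U+10000..U+10FFFF) into UTF-16 surrogates.

    Hypothetical helper, for illustration only.
    """
    if not 0x10000 <= codepoint <= 0x10FFFF:
        raise ValueError("not an astral codepoint: %#x" % codepoint)
    offset = codepoint - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)        # top 10 bits -> high (lead) surrogate
    low = 0xDC00 + (offset & 0x3FF)       # bottom 10 bits -> low (trail) surrogate
    return high, low


# U+1D11E MUSICAL SYMBOL G CLEF
high, low = to_surrogate_pair(0x1D11E)
print("%#x %#x" % (high, low))  # 0xd834 0xdd1e
```

On a narrow build (sys.maxunicode == 0xFFFF) the string holds those two code units separately, so len() of a one-character astral string is 2 and a pattern such as "." can match just one half of the pair.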

I could implement a workaround for it in the regex module, but I think that the proper place to fix it is in the language as a whole, perhaps by implementing PEP 393 ("Flexible String Representation").

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue12729>
_______________________________________