[issue12729] Python lib re cannot handle Unicode properly due to narrow/wide bug

Matthew Barnett report at bugs.python.org
Sun Aug 14 04:06:32 CEST 2011


Matthew Barnett <python at mrabarnett.plus.com> added the comment:

You're right about starting the second search from where the first finished. Caching the position would be an advantage there.
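To make the caching idea concrete, here is a minimal sketch, assuming a string stored as UTF-8 bytes that remembers the last code point index it resolved and the byte offset that went with it. The class name Utf8Text and the method byte_offset are hypothetical, invented for illustration; they are not part of CPython or the re module.

```python
class Utf8Text:
    """Hypothetical UTF-8 backed string with a one-entry position cache."""

    def __init__(self, text):
        self.data = text.encode("utf-8")
        self._cached_index = 0    # code point index last resolved
        self._cached_offset = 0   # byte offset of that code point

    def byte_offset(self, index):
        """Return the byte offset at which code point `index` starts."""
        # Resume from the cached position when the target is at or after it;
        # otherwise fall back to scanning from the start of the data.
        if index >= self._cached_index:
            i, off = self._cached_index, self._cached_offset
        else:
            i, off = 0, 0
        data = self.data
        while i < index:
            if off >= len(data):
                raise IndexError("code point index out of range")
            off += 1
            # Skip continuation bytes, which have the form 0b10xxxxxx.
            while off < len(data) and (data[off] & 0xC0) == 0x80:
                off += 1
            i += 1
        self._cached_index, self._cached_offset = i, off
        return off
```

A second lookup at or beyond the previous position only scans the bytes in between, which is the case where a search resumes from where the last one finished. A lookup before the cached position still pays for a scan from the start in this sketch.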

The memory cost of extra pointers wouldn't be so bad if UTF-8 took less space than the current format.
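As a rough illustration of the space trade-off, the sketch below compares the encoded size of a string in UTF-8 against fixed-width 2-byte and 4-byte code units, which is roughly what narrow and wide builds use. The helper name encoded_sizes is made up for this example, and it ignores per-object overhead and any extra cached pointers.

```python
def encoded_sizes(text):
    """Return the number of bytes `text` occupies in each encoding.

    The "-le" codec variants are used so that no byte order mark is
    added to the count.
    """
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }
```

For mostly-ASCII text UTF-8 comes out well ahead, so there would be room for a few extra pointers; for text dominated by characters that need three bytes in UTF-8, it can be larger than the 2-byte form.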

Regular expressions aren't used as much in Python as they are in Perl. BTW, the current re module was introduced in Python 1.5; the previous regex and regsub modules were removed in Python 2.5.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue12729>
_______________________________________

