[issue2650] re.escape should not escape underscore

Amaury Forgeot d'Arc report at bugs.python.org
Thu Jun 26 17:18:39 CEST 2008


Amaury Forgeot d'Arc <amauryfa at gmail.com> added the comment:

The escaped regexp is not valid UTF-8 (why should it be?), but it still
matches the same bytes in the searched text, which has to be UTF-8
encoded anyway:

>>> import re
>>> text = u"été".encode('utf-8')
>>> regexp = u"é".encode('utf-8')
>>> re.findall(regexp, text)
['\xc3\xa9', '\xc3\xa9']
>>> escaped_regexp = re.escape(regexp)
>>> re.findall(escaped_regexp, text)
['\xc3\xa9', '\xc3\xa9']
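
For readers on Python 3, the same check can be sketched with bytes
objects. This is only an illustration, not part of the original report;
the names plain_matches and escaped_matches are made up for the example,
and the exact form of the escaped pattern depends on the Python version
(older versions backslash-escape the non-ASCII bytes, newer ones leave
them alone), so the sketch compares the matches rather than the pattern:

```python
import re

# "\u00e9t\u00e9" is the word "été"; escapes are used so the example does
# not depend on the encoding of the source file.
text = "\u00e9t\u00e9".encode("utf-8")    # b'\xc3\xa9t\xc3\xa9'
regexp = "\u00e9".encode("utf-8")         # b'\xc3\xa9'

# Search with the raw UTF-8 bytes as the pattern.
plain_matches = re.findall(regexp, text)

# Search again with the escaped pattern. Whatever re.escape adds, the
# pattern still describes the same literal byte sequence.
escaped_regexp = re.escape(regexp)
escaped_matches = re.findall(escaped_regexp, text)

print(plain_matches)
print(escaped_matches)
```

Both calls should return the same two-byte matches, which is the point
made above: escaping changes the pattern's spelling, not what it matches.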

----------
nosy: +amaury.forgeotdarc

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue2650>
_______________________________________

