[issue2650] re.escape should not escape underscore
Morten Lied Johansen
report at bugs.python.org
Thu Jun 26 16:45:12 CEST 2008
Morten Lied Johansen <mortenjo at ifi.uio.no> added the comment:
One issue with the current implementation, which I can't see has been
commented on here, is that it corrupts UTF-8 characters (and probably
those of every other multi-byte character encoding).
An é character in a UTF-8 encoded string is represented by two
bytes. When passed through re.escape, those two bytes are checked
individually, both are considered non-alphanumeric, and both are
consequently escaped, turning the UTF-8 string into complete gibberish.
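
To make the effect concrete, here is a minimal sketch in Python 3 of the
per-byte check described above. Note that legacy_escape is a hypothetical
stand-in written for this illustration, not the actual re.escape source,
and it assumes that "alphanumeric" means ASCII letters and digits only.
Special cases such as NUL bytes are ignored.

```python
import string

# Assumption: only ASCII letters and digits count as alphanumeric.
_ALPHANUM = frozenset((string.ascii_letters + string.digits).encode("ascii"))


def legacy_escape(data: bytes) -> bytes:
    """Hypothetical helper: backslash-escape every non-alphanumeric byte."""
    out = bytearray()
    for byte in data:
        if byte not in _ALPHANUM:
            out += b"\\"  # each byte is judged on its own
        out.append(byte)
    return bytes(out)


encoded = "é".encode("utf-8")      # two bytes: 0xC3 0xA9
escaped = legacy_escape(encoded)   # a backslash now precedes each byte

try:
    escaped.decode("utf-8")
    still_valid_utf8 = True
except UnicodeDecodeError:
    # The lead byte 0xC3 is no longer followed by its continuation byte.
    still_valid_utf8 = False

print(encoded, escaped, still_valid_utf8)
```

The backslash inserted between the lead byte and its continuation byte is
what makes the result undecodable as UTF-8, even though each escape looks
harmless on its own.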
----------
nosy: +mortenlj
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue2650>
_______________________________________