[issue14068] problem with re split

Ramchandra Apte report at bugs.python.org
Tue Feb 21 11:50:35 CET 2012


Ramchandra Apte <maniandram01 at gmail.com> added the comment:

The problem is not in re. You are passing '。' to re.split, and in Python 2.x that literal is a byte string, so what is actually passed is '\xe3\x80\x82' (the UTF-8 bytes, assuming a UTF-8 source file).
You should pass u'。' to re.compile.
Could we raise a SyntaxError when a program contains a non-ASCII character in a bytes string?
Python 3 does so; it raises "SyntaxError: bytes can only contain ASCII literal characters." when you execute b'。'
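
A minimal sketch of the difference, written for Python 3 so that it runs today. The sample text and the '[。]' character-class pattern are assumptions for illustration; the pattern from the original report is not quoted here. The bytes case imitates what Python 2 did with a plain '。' literal in a UTF-8 source file.

```python
import re

# Hypothetical sample text; not taken from the original report.
text = "你好。世界"

# What Python 2 actually held for the plain literal '。' (UTF-8 source).
sep_bytes = "。".encode("utf-8")

# Text pattern on text: the character class holds one character.
text_parts = re.split("[。]", text)

# Bytes pattern on bytes: the class now holds three separate bytes,
# so each byte of the separator matches on its own and empty
# fields appear between them.
data = text.encode("utf-8")
byte_parts = re.split(b"[" + sep_bytes + b"]", data)

# Python 3 refuses a non-ASCII character inside a bytes literal.
try:
    compile("b'。'", "<example>", "eval")
    bytes_literal_rejected = False
except SyntaxError:
    bytes_literal_rejected = True

print(text_parts)
print(byte_parts)
print(bytes_literal_rejected)
```

With a text pattern the split yields two fields; with the bytes pattern it yields four, two of them empty, which is the kind of surprise a u'。' literal avoids in Python 2.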

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue14068>
_______________________________________

