[Python-Dev] regex module

MRAB python at mrabarnett.plus.com
Tue Jan 12 23:10:28 CET 2010


Hi all,

I'm back on the regex module after doing other things and I'd like your
opinion on a number of matters:

Firstly, the current re module has a bug whereby re.split() doesn't split
on zero-width matches. The BDFL has said that this behaviour should be
retained by default in case any existing software depends on it. My
question is: should my regex module still do this for Python 3?
Speaking personally, I'd like it to behave correctly, and Python 3 is
the version where backwards-compatibility is allowed to be broken.
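
To make the zero-width case concrete, here is a minimal sketch of the
splitting behaviour I have in mind. The helper name split_on_all_matches
is hypothetical (it is not part of re or of the regex module); it simply
cuts the string at every match reported by re.finditer, including empty
ones:

```python
import re


def split_on_all_matches(pattern, text):
    """Split text at every match of pattern, zero-width matches included.

    Hypothetical helper for illustration only; not an API of re or regex.
    """
    pieces = []
    last = 0
    for match in re.finditer(pattern, text):
        # Keep the text between the previous match and this one.
        pieces.append(text[last:match.start()])
        last = match.end()
    # Keep whatever follows the final match.
    pieces.append(text[last:])
    return pieces


# The lookahead (?=[A-Z]) matches the empty string before each capital.
print(split_on_all_matches(r"(?=[A-Z])", "splitOnCaps"))
# prints ['split', 'On', 'Caps']
```

An re.split() that skips zero-width matches would leave 'splitOnCaps'
unsplit for the same pattern.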

Secondly, Python 2 is reaching the end of the line and Python 3 is the
future. Should I still release a version that works with Python 2? I'm
thinking that it could be confusing if the new regex module did zero-width
splits correctly in Python 3 but not in Python 2. Alternatively, should I
release it only for Python 3, as a 'carrot'?

Finally, the module allows some extra backslash escapes, e.g. \g<name>, in
the pattern. Should it treat ill-formed escapes, e.g. a bare \g, as the re
module would have treated them?
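
For anyone who wants to check what their own interpreter's re module does
with such a pattern, a small probe along these lines should work. The
function name probe_escape is hypothetical, and the outcome for a bare \g
depends on the Python version, so no particular result is assumed here:

```python
import re


def probe_escape(pattern):
    """Report whether the running re module accepts or rejects a pattern.

    Hypothetical helper for illustration only.
    """
    try:
        re.compile(pattern)
    except re.error as exc:
        # The pattern was refused; include the module's own message.
        return "rejected: %s" % exc
    return "accepted"


# A well-formed escape, then the ill-formed one under discussion.
for candidate in (r"\d+", r"\g"):
    print(repr(candidate), "->", probe_escape(candidate))
```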

Thanks


