[issue31759] re wont recover nor fail on runaway regular expression

Tim Peters report at bugs.python.org
Wed Oct 11 21:52:17 EDT 2017


Tim Peters <tim at python.org> added the comment:

Sure!  The OP was obviously asking about the engine that ships with Python, so that's what I talked about.

Raphaël, Matthew develops an excellent replacement ("regex") for Python's re module, which you can install via, e.g., "pip install regex" (or, on Windows, "python -m pip install regex").  More info here:

https://pypi.python.org/pypi/regex/

Matthew, will that become Python's standard offering some day?  I'd be in favor of that!  It has many advantages, although it doesn't always avoid exponential-time backtracking in failing cases.  For example, `re` and `regex` both take exponential time to fail to match the regexp:

"((xy)+)+$"

against strings of the form:

"xy" * i + "y"

Each time `i` increases by 1, both take about twice as long to fail to match (with either .match or .search), and `re` is actually quicker on my box (3.6.3 on 64-bit Win10).
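
For anyone who wants to see the doubling for themselves, here is a minimal sketch using only the standard `re` module. The helper name `time_failed_match` and the range of `i` values are just illustrative choices, not anything from the tracker, and the absolute timings will vary by machine and Python version:

```python
import re
import time

# The regexp from the comment above: nested quantifiers anchored at the end.
PATTERN = re.compile(r"((xy)+)+$")


def time_failed_match(i):
    """Match PATTERN against "xy" * i + "y"; return (match result, seconds).

    Hypothetical helper for illustration. The trailing "y" makes the
    overall match impossible, so the engine backtracks through the ways
    of splitting the "xy" repetitions between the inner and outer groups
    before it can report failure.
    """
    text = "xy" * i + "y"
    start = time.perf_counter()
    result = PATTERN.match(text)
    elapsed = time.perf_counter() - start
    return result, elapsed


if __name__ == "__main__":
    # Kept small on purpose: each step of i is expected to roughly
    # double the time, so large values become impractical quickly.
    for i in range(12, 19):
        result, elapsed = time_failed_match(i)
        print(f"i={i:2d}  matched={result is not None}  {elapsed:.4f}s")
```

Swapping `PATTERN.match` for `PATTERN.search` should show the same growth, since the failing case is the same either way.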

In any case, I'm closing this, since there's no concrete idea on the table for a change to `re` that would actually help. People ignore warnings, and there's really no way to _guess_ whether a regexp is "taking too long" to begin with. If it's taking minutes, people already discover the hangup as soon as they interrupt the program and see that it's trying to match a regexp.

----------
resolution:  -> wont fix
stage:  -> resolved
status: open -> closed

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue31759>
_______________________________________

