[issue18647] re.error: nothing to repeat

Tim Peters report at bugs.python.org
Sat Aug 10 19:51:45 CEST 2013


Tim Peters added the comment:

Serhiy, yes, I know the regexp you gave takes exponential time.  But:

1. This appears to have nothing to do with repeated 0-length matches.  I gave you an example of a very similar regexp that also takes exponential time, but never makes any 0-length sub-match.

2. Matthew said that Python's engine is not robust against _unbounded_ repeated matching of an empty sub-match, and so "That's the reason for the up-front check".  I was asking for an example of _that_ behavior.  I still haven't seen one.

My goal here is to understand why we're doing this check at all.  If Python's engine cannot in fact be provoked into an infinite loop, the check has at best very little value, as there are many ways to provoke exponential-time behavior, and the possibility of a repeated 0-length sub-match doesn't appear to have much (if anything) to do with it.

(By the way, exponential-time regexps can't always be rewritten to take linear time, although it takes gimmicks like back references to create examples that are inherently expensive.)

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue18647>
_______________________________________


More information about the Python-bugs-list mailing list