non-terminating regex match

Konstantin Veretennicov kveretennicov at gmail.com
Wed Apr 2 15:38:07 EDT 2008


On Wed, Apr 2, 2008 at 9:32 PM, Maurizio Vitale
<mav at cuma.polymath-solutions.lan> wrote:

> Marc 'BlackJack' Rintsch <bj_666 at gmx.net> writes:
>
> > On Wed, 02 Apr 2008 16:01:59 +0000, Maurizio Vitale wrote:
> >
> >> And yes, I'm a total beginner when it comes to Python, but it seems
> >> very strange to me that a regex match on a finite length string
> >> doesn't terminate
> >
> > It does terminate, you just don't wait long enough.  Try it with fewer
> > characters and then increase the identifier character by character and
> > watch the time of the runs grow exponentially.
> >
>
> Ok. Now, my assumption was that re.compile would produce a DFA (or NFA)
> and the match would then being linear in the size of the string.
>

No, it's easy to come up with regex that would perform exponentially by
nesting repeated groups.


> Apparently this is not the case, so is there anything that can be done
> to the regex itself to make sure that whatever re.compile produces is
> more efficient?


Avoid catastrophic backtracking:
http://www.regular-expressions.info/catastrophic.html

--
kv
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-list/attachments/20080402/e07bf0f0/attachment-0001.html>


More information about the Python-list mailing list