regex searching hangs
Rob Renaud
rpgnmets at aol.com
Sun Feb 8 16:30:13 EST 2004
fortepianissimo at yahoo.com.tw (Fortepianissimo) wrote in message news:<ef08387c.0402080729.621632ca at posting.google.com>...
> Could someone explains why the following code hangs (Python 2.3.3)?
>
> --- CODE STARTS ---
> import re
>
> p=re.compile(r'\S+(?:\.\S+)+\.com')
>
> t='......................................'
> p.search(t)
> --- CODE ENDS ---
>
> From the syntax and semantics of regex, the entire t should be
> consumed by the pattern '\S+(?:\.\S+)+' and the search() call should
> return None. Is search() doing a depth-first search and can't pulling
> itself out? It doesn't seem right to me, or I'm missing something?
>
> BTW, I don't think any regex matching should ever hang in any
> circumstances - correct me if I'm wrong: is halting problem a problem
> here?
>
> Thanks!
It doesn't halt, it just seems you found an exponential time search,
searching at length i seems to take about 1.6 times the time it takes
to search at length i-1.
import re
import time
p=re.compile(r'\S+(?:\.\S+)+\.com')
t = '.'
for i in range(100):
start = time.time()
p.search(t*i)
print i, time.time() - start
How should you improve it? I don't know, I can't even read the regexp
:)
More information about the Python-list
mailing list