Python 2.4.1 hang
Fredrik Lundh
fredrik at pythonware.com
Fri Apr 15 16:21:11 EDT 2005
Mahesh wrote:
> I needed to get to the POST body and while I was trying out various
> regular expressions, one of them caused Python to hang. The Python
> process was taking up 100% of the CPU. I couldn't even see the "Max
> recursion depth exceeded message". Is this a bug?
no, it's just a very stupid way to implement a trivial operation.
> import re
>
> s = \
> """POST /TradeManagement-RT3/ReportController.Servlet HTTP/1.1
> <snip>
>
> #pattern_str = "^POST.*\\r\\n\\r((\\n)|(\\n[^\r]*))"
> #pattern_str = "^POST.*\\n((\\n)|(\\n[^\r]*&))"
> pattern_str = "^POST(.*\\n*)+\\n\\n" # <--- Offending pattern
the .* is a variable-length match. so is the \n* that follows it. and then you're putting
both inside a repeated capturing group. and then you're applying it to a moderately large
string. the same text can be divided between those repeats in many different ways, so the
poor engine has to check zillions of combinations before finding something that works.
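to see the effect on a small scale, here's a rough sketch that times the offending pattern against short inputs that can never match. the helper name time_failed_match and the "aaaa" test strings are made up for illustration, and the timings depend on your machine and python build; the point is only that each extra character should cost noticeably more than the one before.

```python
import re
import time

# the offending pattern from the post: a repeated group whose body
# contains two variable-length matches
pattern = re.compile("^POST(.*\\n*)+\\n\\n")

def time_failed_match(n):
    """Time one match attempt against n characters with no blank line."""
    text = "POST" + "a" * n  # made-up input; it never contains "\n\n"
    start = time.perf_counter()
    result = pattern.match(text)
    return result, time.perf_counter() - start

# keep n small: the cost grows very quickly with the input length
for n in (10, 12, 14, 16):
    result, elapsed = time_failed_match(n)
    print("n=%2d matched=%s seconds=%.4f" % (n, result is not None, elapsed))
```

every attempt fails, but only after the engine has tried the many ways of dividing the text between repeats of the group.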
if you want to split on "\r\n\r\n", use split:
header, body = message.split("\r\n\r\n", 1)
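a quick check of the split approach, using a made-up message (the request line, host and form fields are placeholders, not taken from the original post). passing a maxsplit of 1 is an extra precaution: it keeps a blank line inside the body from being treated as another separator.

```python
# made-up request; the body itself contains a blank line
message = (
    "POST /submit HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "\r\n"
    "field=a\r\n"
    "\r\n"
    "field=b"
)

# split only at the first blank line, so the body stays in one piece
header, body = message.split("\r\n\r\n", 1)

print(repr(header))
print(repr(body))
```

without the maxsplit argument this message would split into three parts, and the two-name unpacking would raise ValueError.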
for more robust code, consider using the rfc822 module:
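import rfc822, StringIO
f = StringIO.StringIO(message)
request = f.readline() # the request line ("POST ... HTTP/1.1")
header = rfc822.Message(f) # reads the headers, up to the blank line
body = f.read() # whatever is left is the body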
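the rfc822 module was removed in python 3, so the snippet above only runs on python 2. on python 3, one possible equivalent is the standard email package; the sketch below assumes that approach, and the parse_request helper and the sample message are hypothetical names made up for illustration.

```python
import email

def parse_request(message):
    """Split a raw request into (request line, headers, body)."""
    # peel off the request line; the rest looks like an rfc822-style message
    request_line, _, rest = message.partition("\r\n")
    parsed = email.message_from_string(rest)
    # for a non-multipart message, get_payload() returns the body as a string
    return request_line, parsed, parsed.get_payload()

# made-up request used only to exercise the helper
sample = (
    "POST /submit HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    "\r\n"
    "a=1&b=2"
)

request, headers, body = parse_request(sample)
print(request)
print(headers["Host"])
print(body)
```

the returned headers object supports case-insensitive lookup by name, much like rfc822.Message did.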
</F>