Python 2.4.1 hang

Fredrik Lundh fredrik at pythonware.com
Fri Apr 15 16:21:11 EDT 2005


Mahesh wrote:

> I needed to get to the POST body and while I was trying out various
> regular expressions, one of them caused Python to hang. The Python
> process was taking up 100% of the CPU. I couldn't even see the "Max
> recursion depth exceeded message". Is this a bug?

no, it's just a very stupid way to implement a trivial operation.
 
> import re
> 
> s = \
> """POST /TradeManagement-RT3/ReportController.Servlet HTTP/1.1
> <snip>
> 
> #pattern_str = "^POST.*\\r\\n\\r((\\n)|(\\n[^\r]*))"
> #pattern_str = "^POST.*\\n((\\n)|(\\n[^\r]*&))"
> pattern_str = "^POST(.*\\n*)+\\n\\n" # <--- Offending pattern

the .* is a variable-length match.  so is the \n* that follows it.  and then you're putting
both inside a repeated capturing group.  and then you're applying it to a moderately large
string.  since the text can be divided between those pieces in a huge number of ways, the
poor engine has to check zillions of combinations before finding something that works (or
before concluding that nothing does).
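
to watch the blowup without hanging the interpreter, here's a minimal sketch (python 3
syntax; the helper name and the one-line test strings are made up for illustration, they're
not from the original message).  it times the offending pattern on short inputs that contain
no blank line, so the match has to fail after trying every combination:

```python
import re
import time

# the pattern from the original post, written as a raw string
PATTERN = re.compile(r"^POST(.*\n*)+\n\n")

def time_failed_match(n):
    """time a match attempt that cannot succeed (the text has no blank line)."""
    text = "POST " + "x" * n  # hypothetical input: a single line, no "\n\n"
    start = time.perf_counter()
    result = PATTERN.match(text)
    return result, time.perf_counter() - start

# keep n small: the running time is expected to grow very quickly with n
for n in (8, 12, 16):
    result, elapsed = time_failed_match(n)
    print(n, result, "%.4f s" % elapsed)
```

the exact timings depend on the machine and the python version, but they should climb
steeply as n grows, which is why a real request of a few hundred bytes appears to hang.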

if you want to split on "\r\n\r\n", use split:

    header, body = message.split("\r\n\r\n", 1)
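
if the separator might be missing, or the body might itself contain a blank line, a slightly
more defensive sketch could look like this.  it assumes str.partition is available (python
2.5 or later, so not 2.4.1), and the function name and sample message are hypothetical:

```python
def split_message(message):
    """split a raw http message into (request line, header text, body)."""
    # partition splits at the first separator only, so a body that
    # contains "\r\n\r\n" is left intact
    head, sep, body = message.partition("\r\n\r\n")
    if not sep:
        raise ValueError("no blank line between header and body")
    request_line, _, header_text = head.partition("\r\n")
    return request_line, header_text, body

# hypothetical sample message, for illustration only
sample = (
    "POST /example HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Content-Length: 7\r\n"
    "\r\n"
    "a=1&b=2"
)
request_line, header_text, body = split_message(sample)
print(request_line)
print(body)
```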

for more robust code, consider using the rfc822 module:

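    import StringIO, rfc822

    f = StringIO.StringIO(message)  # wrap the string in a file-like object
    request = f.readline()          # the "POST ... HTTP/1.1" line
    header = rfc822.Message(f)      # reads header lines up to the blank line
    body = f.read()                 # whatever is left is the body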
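
the rfc822 module was removed in python 3.  if you're on python 3, a rough equivalent can be
sketched with the email package instead.  this is an assumption about how you'd port the
idea, not part of the original advice; the function name and sample message are made up:

```python
from email.parser import HeaderParser

def parse_request(message):
    """return (request line, parsed headers, body) for a raw http request string."""
    # peel off the request line first; the rest looks like an rfc 822 style message
    request, _, rest = message.partition("\r\n")
    # HeaderParser parses the header block and keeps everything after
    # the blank line as an unparsed string payload
    headers = HeaderParser().parsestr(rest)
    return request, headers, headers.get_payload()

# hypothetical sample message, for illustration only
sample = (
    "POST /example HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    "\r\n"
    "a=1&b=2"
)
request, headers, body = parse_request(sample)
print(request)
print(headers["Host"])
print(body)
```

header lookups on the returned object are case-insensitive, so headers["host"] works too.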

</F>



