Py 1.5.2 -> Py 2.1.1 broke regular expression?

John W. Baxter jwbaxter at spamcop.com
Sat Aug 11 18:03:46 EDT 2001


In article <mailman.997562539.29253.python-list at python.org>, Roman Suzi
<rnd at onego.ru> wrote:

> On Sat, 11 Aug 2001, Stefan Schwarzer wrote:
> 
> >Hello
> >
> >I have a program which _ran fine with Python 1.5.2_ but fails with Python
> >2.1.1. Its purpose is to extract IRC log sessions from several log files
> >and sort them into one file in the correct order (by date/time).
> >
> >To do this, it uses (in effect) something like
> >  re.compile(pattern).findall(log_lines)
> >
> >That works fine with the pattern string
> >  ^Session Start: \w{3} \w{3} .*?^Session Close: \w{3} \w{3} .*?$
> >but raises a
> >  RuntimeError: maximum recursion limit exceeded
> >in re.compile(pattern).findall(log_lines) when used with the pattern
> >  ^IRC log started \w{3} \w{3} .*?(?=\012IRC log started \w{3} \w{3} |\Z)
> >
> >At first I thought it was because of the number of IRC sessions, but the
> >program fails with the second pattern even for a few (5) sessions. It used
> >to handle much larger numbers.
> >
> >Any hints? A bug or a change in the regular expression syntax; anything else?
> 
> I too encountered such a problem even without the "(?=" construct... and it
> went away when I reduced the number of lines in the file.
> I was in a hurry and forgot to record the conditions that triggered it.
> But I remember that I too used patterns spanning line ends ("\n"s),
> though not \Z.
> 
> It is not safe in Python 2.x to let re's loose on some big file.

Well, there is this from the Library Reference page for the re module
(it's still there in the 2.1 documentation...I don't know what "future"
is here):

Implementation note: The re module has two distinct implementations:
sre is the default implementation and includes Unicode support, but may
run into stack limitations for some patterns. Though this will be fixed
for a future release of Python, the older implementation (without
Unicode support) is still available as the pre module.

I think Stefan has found an instance of "some patterns".

I suppose the first thing to try would be to import and use pre rather
than re.
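
A minimal sketch of that swap, untested against Stefan's actual logs. The
name find_sessions and the sample text are made up for illustration, and
the M and S flags are an assumption, since the post doesn't show which
flags the program passes. The import falls back to re where pre isn't
available:

```python
# Prefer the older engine where it is available; otherwise fall back to re.
try:
    import pre as regex   # older implementation, no Unicode support
except ImportError:
    import re as regex

# Second pattern from Stefan's post, unchanged.
PATTERN = r"^IRC log started \w{3} \w{3} .*?(?=\012IRC log started \w{3} \w{3} |\Z)"

def find_sessions(log_text):
    """Return each 'IRC log started ...' session as one string.

    The M and S flags are assumed: M lets ^ match at every line start,
    S lets . cross line ends.
    """
    return regex.compile(PATTERN, regex.M | regex.S).findall(log_text)

# Made-up sample with two sessions.
sample = (
    "IRC log started Sat Aug 11 10:00:00 2001\n"
    "<a> hello\n"
    "IRC log started Sat Aug 11 12:00:00 2001\n"
    "<b> bye\n"
)
sessions = find_sessions(sample)
```

Since only the import line changes, the rest of the program can keep
calling compile() and findall() as before.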

  --John


