[Python-bugs-list] [ python-Bugs-602444 ] non greedy match bug

Tue, 20 May 2003 22:54:29 -0700

Bugs item #602444, was opened at 2002-08-30 07:44
Message generated for change (Comment added) made by bcannon
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=602444&group_id=5470

Category: Regular Expressions
Group: Python 2.3
>Status: Closed
>Resolution: Rejected
Priority: 5
Submitted By: Robert Roy (rjroy)
Assigned to: Fredrik Lundh (effbot)
Summary: non greedy match bug

Initial Comment:
When using the following re to extract all objects from a 
PDF file, I get a maximum recursion limit exceeded error.

Attached is a PDF file that will reproduce the error.

If I do import pre as re, it works fine.

Platform is Win2k, Python 2.2.1 build #34.

#######
import re

GETOBJECT = re.compile(r'\d+\s+\d+\s+obj.+?endobj',
                       re.I|re.S|re.M)

pdf = open('userguide.pdf', 'rb').read()
all = GETOBJECT.findall(pdf)
print len(all)

----------------------------------------------------------------------

>Comment By: Brett Cannon (bcannon)
Date: 2003-05-20 22:54

Message:

Closing this since hitting the recursion limit is not a bug.

----------------------------------------------------------------------

Comment By: Robert Roy (rjroy)
Date: 2003-02-14 10:56

Message:

The maximum recursion limit problem in the re module is well known.
Until this limitation in the implementation is removed, see the
following for ways to work around it:

http://www.python.org/dev/doc/devel/lib/module-re.html
http://python.org/sf/493252
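
One possible workaround, sketched here and not taken from either link
above: let the regular expression match only the short object header,
and locate the closing "endobj" with a plain substring search, so the
engine never has to step through a long non-greedy '.+?' span. The
names OBJ_HEADER, END_MARKER and find_objects are made up for this
illustration. The sketch assumes the whole file fits in memory and is
written for a current Python, so it uses bytes literals.

```python
import re

# Matches only the short, bounded object header, e.g. b"12 0 obj".
# (Hypothetical names; this is an illustration, not the re module's fix.)
OBJ_HEADER = re.compile(br'\d+\s+\d+\s+obj', re.I)
END_MARKER = b'endobj'


def find_objects(data):
    """Return every 'N G obj ... endobj' span found in data (bytes)."""
    objects = []
    lowered = data.lower()  # the original pattern used re.I
    pos = 0
    while True:
        header = OBJ_HEADER.search(data, pos)
        if header is None:
            break
        # The original pattern's '.+?' needs at least one character
        # between 'obj' and 'endobj', so start looking one byte later.
        end = lowered.find(END_MARKER, header.end() + 1)
        if end == -1:
            break
        end += len(END_MARKER)
        objects.append(data[header.start():end])
        pos = end
    return objects
```

Used in place of the findall call from the initial comment, this would
look like find_objects(open('userguide.pdf', 'rb').read()). Like the
original pattern, it stops at the first "endobj" after each header, so
it shares the same blind spot if that text happens to occur inside
stream data.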

----------------------------------------------------------------------
