[Python-bugs-list] [ python-Bugs-602444 ] non greedy match bug
SourceForge.net
noreply@sourceforge.net
Tue, 20 May 2003 22:54:29 -0700
Bugs item #602444, was opened at 2002-08-30 07:44
Message generated for change (Comment added) made by bcannon
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=602444&group_id=5470
Category: Regular Expressions
Group: Python 2.3
>Status: Closed
>Resolution: Rejected
Priority: 5
Submitted By: Robert Roy (rjroy)
Assigned to: Fredrik Lundh (effbot)
Summary: non greedy match bug
Initial Comment:
When using the following re to extract all objects from a
PDF file, I get a maximum recursion limit exceeded error.
Attached is a PDF file that will reproduce the error.
If I do "import pre as re", it works fine.
Platform is Win2k, Python 2.2.1 build #34.
#######
import re
GETOBJECT = re.compile(r'\d+\s+\d+\s+obj.+?endobj',
                       re.I|re.S|re.M)
pdf = open('userguide.pdf', 'rb').read()
all = GETOBJECT.findall(pdf)
print len(all)
----------------------------------------------------------------------
>Comment By: Brett Cannon (bcannon)
Date: 2003-05-20 22:54
Message:
Logged In: YES
user_id=357491
Closing this since hitting the recursion limit is not a bug.
----------------------------------------------------------------------
Comment By: Robert Roy (rjroy)
Date: 2003-02-14 10:56
Message:
Logged In: YES
user_id=352797
The max recursion limit problem in the re module is well known.
Until this limitation in the implementation is removed, see the
following for workarounds:
http://www.python.org/dev/doc/devel/lib/module-re.html
http://python.org/sf/493252
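
One possible way to sidestep the limit, offered only as a sketch and not
taken from either of the pages above: keep the regular expression short
and let a plain loop walk the file, so that no single match has to span a
large object body with '.+?'. The names OBJ_HEADER, OBJ_END and
find_objects are hypothetical, and the code is written in current
Python 3 syntax on bytes rather than for Python 2.2.

```python
import re

# Hypothetical helper names; none of this comes from the original report.
# Each pattern is short, so no single match spans a large object body.
OBJ_HEADER = re.compile(rb'\d+\s+\d+\s+obj', re.I)
OBJ_END = re.compile(rb'endobj', re.I)


def find_objects(data):
    """Return every 'N G obj ... endobj' span found in data (bytes)."""
    objects = []
    pos = 0
    while True:
        header = OBJ_HEADER.search(data, pos)
        if header is None:
            break
        # The reported pattern uses '.+?', which needs at least one byte
        # of body, so start looking for 'endobj' one byte past the header.
        end = OBJ_END.search(data, header.end() + 1)
        if end is None:
            break
        objects.append(data[header.start():end.end()])
        pos = end.end()
    return objects
```

With the file from the report this would be called as
find_objects(open('userguide.pdf', 'rb').read()). On small inputs it is
meant to return the same spans as GETOBJECT.findall; that has not been
checked against the attached PDF.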
----------------------------------------------------------------------