[Python-bugs-list] [ python-Bugs-602444 ] non greedy match bug

SourceForge.net noreply@sourceforge.net
Fri, 14 Feb 2003 10:56:24 -0800


Bugs item #602444, was opened at 2002-08-30 10:44
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=602444&group_id=5470

Category: Regular Expressions
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Robert Roy (rjroy)
Assigned to: Fredrik Lundh (effbot)
Summary: non greedy match bug

Initial Comment:
When using the following re to extract all objects from a 
PDF file, I get a maximum recursion limit exceeded error.

Attached is a pdf file that will reproduce the error.

If I do import pre as re, it works fine.

Platform is Win2k, Python 2.2.1 build #34.

#######
import re

GETOBJECT = re.compile(r'\d+\s+\d+\s+obj.+?endobj', re.I|re.S|re.M)

pdf = open('userguide.pdf', 'rb').read()
all = GETOBJECT.findall(pdf)
print len(all)


----------------------------------------------------------------------

Comment By: Robert Roy (rjroy)
Date: 2003-02-14 13:56

Message:

The maximum recursion limit problem in the re module is well known.
Until this limitation in the implementation is removed, see the
following for workarounds:

http://www.python.org/dev/doc/devel/lib/module-re.html
http://python.org/sf/493252
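
One possible way to sidestep the recursion is to keep the non-greedy
'.+?' out of the pattern altogether: match only the object header with
a regular expression, then search for the terminator separately. The
sketch below is an illustration only, not taken from the pages above.
It assumes a current Python 3 interpreter and bytes input (the original
report used Python 2.2 strings), and the names OBJ_HEADER, OBJ_END and
find_objects are hypothetical.

```python
import re

# Match only the object header; there is no '.+?' wildcard in the pattern.
OBJ_HEADER = re.compile(rb'\d+\s+\d+\s+obj', re.I)
# The terminator is searched for separately.
OBJ_END = re.compile(rb'endobj', re.I)


def find_objects(data):
    """Return each 'N N obj ... endobj' span found in data (bytes)."""
    objects = []
    pos = 0
    while True:
        header = OBJ_HEADER.search(data, pos)
        if header is None:
            break
        # '.+?' requires at least one character before 'endobj',
        # so start looking one byte past the end of the header.
        end = OBJ_END.search(data, header.end() + 1)
        if end is None:
            break
        objects.append(data[header.start():end.end()])
        # Resume after this object so the spans never overlap.
        pos = end.end()
    return objects
```

With the file from the report this would be called as
find_objects(open('userguide.pdf', 'rb').read()). Each search is a
short, fixed pattern, so the work per object no longer depends on how
much data lies between 'obj' and 'endobj'.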

----------------------------------------------------------------------