[ python-Bugs-1054564 ] RE '*.?' cores if len of found string exceeds 10000

SourceForge.net noreply at sourceforge.net
Tue Oct 26 14:55:58 CEST 2004


Bugs item #1054564, was opened at 2004-10-26 12:55
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1054564&group_id=5470

Category: Regular Expressions
Group: Python 2.2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Rob (rwhent)
Assigned to: Fredrik Lundh (effbot)
Summary: RE '*.?' cores if len of found string exceeds 10000

Initial Comment:
Whilst parsing some extremely long strings I found that
re.match causes segmentation faults on Solaris 2.8
when the regex contains '.*?' and the text matched by
that part of the regex exceeds roughly 10000 chars
(actually the limit seemed to be exactly 8192 chars).

This is the regex used:

    if re.match(r'^.*?\[\s*[A-Za-z_0-9]+\s*\].*', string):

This regex looks for '[alphaNum_]' anywhere in a large
string.

When it failed, the string was 8192 chars long with no
matching '[alphaNum_]' present. If I reduce the length
of the string below 8192, it works OK.
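
The failing case described above can be sketched roughly as follows.
This is only an illustration: the filler character 'x', the helper name
has_bracketed_name and the sample token '[abc_1]' are assumptions, not
taken from the original application. On the affected setup (Python 2.2.3
on Solaris 2.8) the first call is the one reported to crash; on an
unaffected interpreter it simply finds no match.

```python
import re

# Pattern copied from the report, written as a raw string.
PATTERN = r'^.*?\[\s*[A-Za-z_0-9]+\s*\].*'

def has_bracketed_name(text):
    """Return True if the report's pattern matches `text` (hypothetical helper)."""
    return re.match(PATTERN, text) is not None

# Hypothetical inputs: 8192 filler chars, without and with a '[name]' token.
no_token = 'x' * 8192
with_token = 'x' * 8192 + '[abc_1]'

print(has_bracketed_name(no_token))    # the case reported to segfault
print(has_bracketed_name(with_token))
```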

This is a major issue for my application, as some strings
to be parsed are very large. I saw some discussion of a
similar issue on another bulletin board.
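
A possible workaround, offered only as a sketch: if the application just
needs to know whether a '[alphaNum_]' token is present, the leading
'^.*?' and trailing '.*' can be dropped and re.search used in place of
re.match. This assumes the crash is tied to the lazy '.*?' prefix, as the
report suggests, and that the input is a single line ('.' does not match
a newline by default, so the two forms can differ on multi-line text).
The names TOKEN and find_bracketed_name are made up for illustration, and
this has not been verified on Python 2.2.3.

```python
import re

# Same bracketed-name test, without the '^.*?' prefix or '.*' suffix.
TOKEN = re.compile(r'\[\s*[A-Za-z_0-9]+\s*\]')

def find_bracketed_name(text):
    """Return the first '[name]' token in `text`, or None (hypothetical helper)."""
    m = TOKEN.search(text)
    if m:
        return m.group(0)
    return None

# Example use on a long string with no token, and a short one with a token.
print(find_bracketed_name('x' * 8192))
print(find_bracketed_name('aaa[ foo_1 ]bbb'))
```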



----------------------------------------------------------------------
