[Python-bugs-list] [ python-Bugs-786970 ] re doesn't like (^$)*
SourceForge.net
noreply at sourceforge.net
Mon Aug 11 14:21:43 EDT 2003
Bugs item #786970, was opened at 2003-08-11 14:21
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=786970&group_id=5470
Category: Regular Expressions
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Andrew Dalke (dalke)
Assigned to: Fredrik Lundh (effbot)
Summary: re doesn't like (^$)*
Initial Comment:
Nor, for that matter, does it like "(^)*"
% python
Python 2.3 (#1, Aug 3 2003, 02:47:49)
[GCC 3.1 20020420 (prerelease)] on darwin
>>> import re
>>> re.compile("(^$)*").match("")
Segmentation fault
%
It's trying real hard to match 0 characters an infinite
number of times. :)
The segfault is caused in part by the low stacksize limit
on my OS X machine:
% limit stacksize
stacksize 512 kbytes
% limit stacksize 2000kbytes
% limit stacksize
stacksize 2000 kbytes
% python
Python 2.3 (#1, Aug 3 2003, 02:47:49)
[GCC 3.1 20020420 (prerelease)] on darwin
Type "help", "copyright", "credits" or "license" for more
information.
>>> import re
>>> re.compile("(^$)*").match("")
Traceback (most recent call last):
File "<stdin>", line 1, in ?
RuntimeError: maximum recursion limit exceeded
>>>
which suggests that the stack recursion limit
test for the re library is not the same as the one
used for the rest of Python. (def f(): f() gives
me the expected recursion limit, and not a
segfault)
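For reference, the plain-Python check mentioned above can be sketched roughly as follows. The helper name hits_recursion_limit is made up for illustration, and the exact exception message varies between Python versions; the point is only that the interpreter's own recursion check raises a catchable exception instead of crashing the process.

```python
def f():
    # Unbounded recursion, as in the "def f(): f()" test above.
    f()


def hits_recursion_limit():
    """Return True if the interpreter stops f() with an exception.

    Hypothetical helper, not part of any library. RuntimeError is
    caught because that is what the 2.3 traceback shows; newer
    Pythons raise RecursionError, which is a RuntimeError subclass.
    """
    try:
        f()
    except RuntimeError:
        return True
    return False


if __name__ == "__main__":
    print(hits_recursion_limit())
```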
Seems like the bug could be in several places:
- the compiler doesn't handle infinite loops of
zero-character tests well (it could convert
them to a finite-loop test)
- the re matcher doesn't check that it's been
in the same place several times without
advancing any character positions
- the re matcher doesn't use the same stack
check used elsewhere in Python
- the Mac stacksize default is too low for
Python's needs
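To illustrate the second possibility in the list above, here is a minimal sketch of a "*" loop that stops once an iteration matches without advancing. It is a toy matcher written for this report, not the sre implementation; match_star, empty_string and letter_a are hypothetical names, and the loop is greedy with no backtracking.

```python
def match_star(body, text, pos=0):
    """Apply body zero or more times; return the final position.

    body(text, pos) returns the position after one match, or None
    if it does not match at pos.
    """
    while True:
        end = body(text, pos)
        if end is None:
            break  # body no longer matches
        if end == pos:
            # Zero-width match: another iteration would start from
            # the same place and never advance, so stop here
            # instead of looping (or recursing) forever.
            break
        pos = end
    return pos


def empty_string(text, pos):
    # Stand-in for "^$": matches only at the start of an empty
    # string, and consumes nothing.
    return pos if pos == 0 and len(text) == 0 else None


def letter_a(text, pos):
    # Stand-in for the pattern "a": consumes one character.
    return pos + 1 if text.startswith("a", pos) else None


if __name__ == "__main__":
    print(match_star(empty_string, ""))   # terminates despite zero width
    print(match_star(letter_a, "aaab"))
```

With a guard like this, the toy equivalent of "(^$)*" against "" returns immediately at position 0, which is the behavior I'd expect from re as well.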
BTW, checking pcre ...
>>> import pre
/usr/local/lib/python2.3/pre.py:94: DeprecationWarning:
Please use the 're' module, not the 'pre' module
DeprecationWarning)
>>> pre.compile("(^$)*").match("")
Traceback (most recent call last):
File "<stdin>", line 1, in ?
File "/usr/local/lib/python2.3/pre.py", line 251, in compile
code=pcre_compile(pattern, flags, groupindex)
pcre.error: ('operand of unlimited repeat could match the
empty string', 4)
>>>
which is true, but the pattern I used should (IMHO)
be allowed to match the empty string.
----------------------------------------------------------------------