[Patches] [ python-Patches-527371 ] Fix for sre bug 470582
noreply@sourceforge.net
noreply@sourceforge.net
Wed, 06 Nov 2002 06:24:20 -0800
Patches item #527371, was opened at 2002-03-08 13:14
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=527371&group_id=5470
Category: Modules
Group: None
>Status: Closed
Resolution: Accepted
Priority: 8
Submitted By: Greg Chapman (glchapman)
Assigned to: Fredrik Lundh (effbot)
Summary: Fix for sre bug 470582
Initial Comment:
Bug report 470582 points out that nested groups can
produces matches in sre even if the groups within
which they are nested do not match:
>>> m = sre.search(r"^((\d)\:)?(\d\d)\.(\d\d\d)
$", "34.123")
>>> m.groups()
(None, '3', '34', '123')
>>> m = pre.search(r"^((\d)\:)?(\d\d)\.(\d\d\d)
$", "34.123")
>>> m.groups()
(None, None, '34', '123')
I believe this is because in the handling of
SRE_OP_MAX_UNTIL, state->lastmark is being reduced
(after "((\d)\:)" fails) without NULLing out the now-
invalid entries at the end of the state->mark array.
In the other two cases where state->lastmark is
reduced (specifically in SRE_OP_BRANCH and
SRE_OP_REPEAT_ONE) memset is used to NULL out the
entries at the end of the array. The attached patch
does the same thing for the SRE_OP_MAX_UNTIL case.
This fixes the above case and does not break anything
in test_re.py.
----------------------------------------------------------------------
>Comment By: Gustavo Niemeyer (niemeyer)
Date: 2002-11-06 14:24
Message:
Logged In: YES
user_id=7887
Applied as:
Lib/test/re_tests.py:1.30->1.31
Lib/test/test_sre.py:1.37->1.38
Misc/NEWS:1.511->1.512
Modules/_sre.c:2.83->2.84
Thank you very much!
----------------------------------------------------------------------
Comment By: Greg Chapman (glchapman)
Date: 2002-10-04 16:51
Message:
Logged In: YES
user_id=86307
Assuming this patch is acceptable (I see it has not yet been
applied to _sre.c), I wonder if it would be a good candidate for a
backport to 2.2.2? (Though it still lacks a fix for the lastindex
problem.)
----------------------------------------------------------------------
Comment By: Greg Chapman (glchapman)
Date: 2002-08-12 21:17
Message:
Logged In: YES
user_id=86307
I noticed recently that the lastindex attribute of match objects
is now documented, so I believe that the lastindex problem I
described in my March 8 posting needs to be fixed. Simply,
lastindex may claim that a group matched when in fact it
didn't (because lastindex does not get updated when
lastmark is reset to a lower value):
>>> m = sre.match('(\d)?\d\d', '12')
>>> m.groups()
(None,)
>>> m.lastindex
1
----------------------------------------------------------------------
Comment By: Fredrik Lundh (effbot)
Date: 2002-07-12 11:11
Message:
Logged In: YES
user_id=38376
(bumped priority as a reminder to self) /F
----------------------------------------------------------------------
Comment By: Tim Peters (tim_one)
Date: 2002-03-08 18:28
Message:
Logged In: YES
user_id=31435
Assigned to /F -- he's the expert here.
----------------------------------------------------------------------
Comment By: Greg Chapman (glchapman)
Date: 2002-03-08 15:23
Message:
Logged In: YES
user_id=86307
I'm pretty sure the memset is correct; state->lastmark is
the index of last mark written to (not the index of the
next potential write).
Also, it occurred to me that there is another related error
here:
>>> m = sre.search(r'^((\d)\:)?\d\d\.\d\d\d$', '34.123')
>>> m.groups()
(None, None)
>>> m.lastindex
2
In other words, lastindex claims that group 2 was the last
that matched, even though it didn't really match. Since
lastindex is undocumented, this probably doesn't matter too
much. Still, it probably should be reset if it is pointing
to a group which gets "unmatched" when state->lastmark is
reduced. Perhaps a function like the following should be
added for use in the three places where state->lastmark is
reset to a previous value:
void lastmark_restore(SRE_STATE *state, int lastmark)
{
assert(lastmark >= 0);
if (state->lastmark > lastmark) {
int lastvalidindex =
(lastmark == 0) ? -1 : (lastmark-1)/2+1;
if (state->lastindex > lastvalidindex)
state->lastindex = lastvalidindex;
memset(
state->mark + lastmark + 1, 0,
(state->lastmark - lastmark) * sizeof(void*)
);
}
state->lastmark = lastmark;
}
----------------------------------------------------------------------
Comment By: Neal Norwitz (nnorwitz)
Date: 2002-03-08 13:29
Message:
Logged In: YES
user_id=33168
Confirmed that the test w/o fix fails
and the test passes with the fix to _sre.c.
But I'm not sure if the memset can go too far:
memset(state->mark + lastmark + 1, 0,
(state->lastmark - lastmark) * sizeof(void*));
I can try under purify, but that doesn't guarantee anything.
----------------------------------------------------------------------
Comment By: Greg Chapman (glchapman)
Date: 2002-03-08 13:20
Message:
Logged In: YES
user_id=86307
I forgot: here's a patch for re_tests.py which adds the
case from the bug report as a test.
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=527371&group_id=5470