[ python-Bugs-1647489 ] zero-length match confuses re.finditer()

SourceForge.net noreply at sourceforge.net
Mon Jan 29 23:35:22 CET 2007


Bugs item #1647489, was opened at 2007-01-29 14:35
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1647489&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Regular Expressions
Group: Python 2.5
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Jacques Frechet (jfrechet)
Assigned to: Gustavo Niemeyer (niemeyer)
Summary: zero-length match confuses re.finditer()

Initial Comment:
Hi!

re.finditer() seems to incorrectly increment the current position immediately after matching a zero-length substring.  For example:

>>> [m.groups() for m in re.finditer(r'(^z*)|(\w+)', 'abc')]
[('', None), (None, 'bc')]

What happened to the 'a'?  I expected this result:

[('', None), (None, 'abc')]

Perl agrees with me:

% perl -le 'print defined($1)?"\"$1\"":"undef",",",defined($2)?"\"$2\"":"undef" while "abc" =~ /(z*)|(\w+)/g' 
"",undef
undef,"abc"
"",undef

Similarly, if I remove the ^:

>>> [m.groups() for m in re.finditer(r'(z*)|(\w+)', 'abc')]
[('', None), ('', None), ('', None), ('', None)]

Now all of the letters have fallen through the cracks!  I expected this result:

[('', None), (None, 'abc'), ('', None)]

Again, perl agrees:

% perl -le 'print defined($1)?"\"$1\"":"undef",",",defined($2)?"\"$2\"":"undef" while "abc" =~ /(z*)|(\w+)/g' 
"",undef
undef,"abc"
"",undef

If this bug has already been reported, I apologize -- I wasn't able to find it here.  I haven't looked at the code for the re module, but this seems like the sort of bug that might have been accidentally introduced in order to try to prevent the same zero-length match from being returned forever.

Thanks,
Jacques

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1647489&group_id=5470


More information about the Python-bugs-list mailing list