[Python-Dev] Bug? re.finditer fails to terminate with empty match

Kevin J. Butler python-kbutler at sabaydi.com
Wed Oct 1 19:00:01 EDT 2003


The iterator returned by re.finditer appears not to terminate if the 
final match is empty; instead it keeps returning that final (empty) match.

Is this a bug in _sre?  If so, I'll be happy to file it, though fixing 
it is a bit beyond my _sre experience level at this point.  The solution 
would appear to be either to add a check for a duplicate match in 
iterator.next(), or to increment the position by one after returning an 
empty match (which should be OK, because if a non-empty match started at 
that location, we would have returned it instead of the empty match).
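To make the second option concrete, here is a minimal pure-Python sketch 
of the "step past an empty match" idea.  It is not the _sre 
implementation; iter_matches is a hypothetical helper name, written in 
current Python syntax, and it only illustrates the position-advancing 
logic using pattern.search() with a start position.

```python
import re

def iter_matches(pattern, string):
    """Yield successive matches, stepping one character past empty ones.

    Hypothetical helper illustrating the proposed fix; not part of re.
    """
    regex = re.compile(pattern)
    pos = 0
    # pos == len(string) is still a valid place for an empty match.
    while pos <= len(string):
        m = regex.search(string, pos)
        if m is None:
            break
        yield m
        if m.end() == m.start():
            # Empty match: searching again from the same position would
            # find it again, so move one character forward.
            pos = m.end() + 1
        else:
            # Non-empty match: resume where it ended.
            pos = m.end()

spans = [m.span() for m in iter_matches(".*", "asdf")]
print(spans)
```

With the pattern and string from the report, this should yield the 
"asdf" match and then a single empty match at the end before stopping.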

Code to illustrate the failure:

from re import finditer

last = None
for m in finditer( ".*", "asdf" ):
    if last == m.span():
        print "duplicate match:", last
        break
    print m.group(), m.span()
    last = m.span()
   
---
asdf (0, 4)
 (4, 4)
duplicate match: (4, 4)
---

findall works:

import re
print re.findall( ".*", "asdf" )
['asdf', '']

The workaround is to explicitly check for a duplicate span, as I did 
above, or to check for a duplicate end(), which also skips the final 
empty match.
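Both forms of the workaround can be wrapped in a small generator.  This 
is only a sketch in current Python syntax: finditer_guarded and its 
compare_end flag are hypothetical names, not part of the re module.  On 
an interpreter where finditer already terminates, the span guard simply 
never triggers.

```python
import re

def finditer_guarded(pattern, string, compare_end=False):
    """Wrap re.finditer and stop as soon as a match repeats.

    Hypothetical helper.  By default the full span is compared; with
    compare_end=True only end() is compared, which also drops an empty
    match that ends where the previous match ended.
    """
    last = None
    for m in re.finditer(pattern, string):
        key = m.end() if compare_end else m.span()
        if key == last:
            break  # same span (or same end) as the previous match
        yield m
        last = key

print([m.span() for m in finditer_guarded(".*", "asdf")])
print([m.span() for m in finditer_guarded(".*", "asdf", compare_end=True)])
```

Comparing spans keeps the trailing empty match, matching what findall 
reports; comparing end() is the stricter variant that leaves it out.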

kb



