Must be a bug in the re module [was: Why this result with the re module]

John Bond lists at asd-group.com
Tue Nov 2 23:55:44 EDT 2010


> Could you please reconsider how would you
> work with this new one and see if my steps
> are correct? If you agree with my 7-step
> execution for the new regex, then:
>
> We finally found a real bug for re.findall:
>
> >>> re.findall('((.a.)*)*', 'Mary has a lamb')
> [('', 'Mar'), ('', ''), ('', ''), ('', 'lam'), ('', ''), ('', '')]
>
>
> Cheers,
>
> Yingjie

Nope, I'm afraid it's a lack of understanding again.

The outer capturing group that you've added wraps everything the inner
one matches, and it is repeated as well. The overall pattern still finds
the same six matches (which you now accept). A repeated group only
returns its last repetition, and for the outer group that last
repetition is always an empty one, so it returns one thing - an empty
string. Findall is simply putting that empty string first in each of the
six tuples it returns, next to the value from the inner group.
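
If you want to check this yourself, here is a minimal sketch. It is my own
illustration rather than anything from the earlier messages, and the
variable names (text, whole, last, flat, nested) are made up for the
example. It shows a repeated group keeping only its last repetition, then
runs the pattern with and without the outer group on the same string.

```python
import re

text = 'Mary has a lamb'

# A repeated group keeps only its last repetition: the overall match
# is 'has a lam', but group 1 reports just 'lam'.
m = re.match(r'(.a.)*', 'has a lamb')
whole, last = m.group(0), m.group(1)
print(repr(whole), repr(last))

# Inner group only: six matches, four of them empty.
flat = re.findall(r'(.a.)*', text)
print(flat)  # ['Mar', '', '', 'lam', '', '']

# With the outer group added: findall returns one (outer, inner)
# tuple per match instead of a single string.
nested = re.findall(r'((.a.)*)*', text)
print(nested)
```

Compare the second element of each tuple in the last line with the list
printed just before it.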

You just need to accept that findall (like all of re) works fine, and if 
it doesn't seem to do what you expect, it's because the expectation is 
wrong.

Cheers, JB


