regexp non-greedy matching bug?

Sam Pointon free.condiments at gmail.com
Sat Dec 3 23:04:28 EST 2005


My understanding of .*? and its ilk is that they match as little as
possible while still letting the rest of the pattern match, just as .*
matches as much as possible. In the first instance, the first (.*?) did
not have to match anything, because everything in the rest of the
pattern can match zero times. I think that in such a situation (a
non-greedy operator followed only by operators that can match zero
times) the non-greedy group will never capture anything. However, in
the second instance, the pattern has to match something later on in the
string, so the group will capture something.
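
As a rough sketch of what I mean (the patterns and the test string here
are made up for illustration, they are not the ones from the original
question):

```python
import re

# Hypothetical example, not the pattern from the original post.

# Everything after (.*?) is optional, so the lazy group can stop at
# zero characters and the overall match still succeeds.
first = re.match(r"(.*?)(x)?", "abx")
print(first.groups())   # ('', None)

# Anchoring with $ forces the match to reach the end of the string,
# so the lazy group has to grow until the rest of the pattern fits.
second = re.match(r"(.*?)(x)?$", "abx")
print(second.groups())  # ('ab', 'x')
```

In the first call the optional (x)? is tried at position 0, finds no
"x" there, and is simply skipped, which is why its group is None.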

I believe that this is an operator precedence problem (greedy ? losing
to .*?), but that it is to be expected. If that is the case, then by
all means a note about it should be added to the docs.

However, I am not a regular expression expert, so my analysis of the
situation may be well off the mark.



