regex confusion
Peter Hansen
peter at engcorp.com
Tue Dec 9 11:43:03 EST 2003
"Diez B. Roggisch" wrote:
>
> John Hunter wrote:
>
> >
> > In trying to debug why a certain regex wasn't working like I expected
> > it to, I came across this strange (to me) behavior. The file I am
> > trying to match definitely contains many instances of the letter 'a',
> > so I would expect the regex
> >
> > rgxPrev = re.compile('.*?a.*?')
>
> This is a bogus regex - a '*' means "zero or more occurrences" for the
> expression to the left. '?' means "zero or one occurrence" for the
> expression to the left.
Not true. See http://www.python.org/doc/current/lib/re-syntax.html :
*?, +?, ??
The "*", "+", and "?" qualifiers are all greedy; they match as much text
as possible. .... Adding "?" after the qualifier makes it perform the match
in non-greedy or minimal fashion; as few characters as possible will be
matched. ....
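
A quick way to see the difference, as a minimal sketch: the sample string and the variable names below are made up for illustration and are not taken from the original poster's file. Only re.match from the standard library is used.

```python
import re

# Hypothetical sample text; the original poster's file is not shown in the thread.
text = "xxaxxaxx"

# Greedy: '.*' grabs as much as it can, then backs off to the LAST 'a'.
greedy = re.match(r'.*a', text).group()

# Non-greedy: '.*?' grabs as little as it can, so the match stops at the FIRST 'a'.
lazy = re.match(r'.*?a', text).group()

# The pattern from the original post: the trailing '.*?' is satisfied by
# matching nothing, so the overall match also ends right after the first 'a'.
original = re.match(r'.*?a.*?', text).group()

# '??' is the non-greedy form of '?': it prefers zero occurrences.
optional_greedy = re.match(r'a?', 'a').group()
optional_lazy = re.match(r'a??', 'a').group()

print(repr(greedy))           # 'xxaxxa'
print(repr(lazy))             # 'xxa'
print(repr(original))         # 'xxa'
print(repr(optional_greedy))  # 'a'
print(repr(optional_lazy))    # ''
```

So '.*?a.*?' is a legal pattern, but the trailing '.*?' adds nothing to the match because it is allowed to match the empty string.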
-Peter