Stupid Regex Question

Robert Brewer fumanchu at amor.org
Wed Apr 21 13:13:15 EDT 2004


Jonas Galvez wrote:
> > [Fredrik Lundh]
> > print re.findall("<tag>(.*?)</tag>", str)
>
> But now I'm puzzled. What is that '?' doing, exactly?
> 
> Could you point me to any references?

http://docs.python.org/lib/re-syntax.html

*?, +?, ?? 
The "*", "+", and "?" qualifiers are all greedy; they match as much text
as possible. Sometimes this behaviour isn't desired; if the RE <.*> is
matched against '<H1>title</H1>', it will match the entire string, and
not just '<H1>'. Adding "?" after the qualifier makes it perform the
match in non-greedy or minimal fashion; as few characters as possible
will be matched. Using .*? in the previous expression will match only
'<H1>'.
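
To see the difference for yourself, here is a quick sketch. The '<H1>' string is the one from the docs excerpt above; the '<tag>' sample text is made up purely for illustration, and the variable names are mine.

```python
import re

html = '<H1>title</H1>'

# Greedy: .* grabs as much as it can, so the match runs to the last '>'.
greedy = re.match(r'<.*>', html).group()

# Non-greedy: .*? stops at the first '>' that lets the match succeed.
lazy = re.match(r'<.*?>', html).group()

# Same idea with the pattern from Fredrik's post (sample text is invented).
text = '<tag>one</tag> and <tag>two</tag>'
greedy_tags = re.findall(r'<tag>(.*)</tag>', text)
lazy_tags = re.findall(r'<tag>(.*?)</tag>', text)

print(greedy)       # <H1>title</H1>
print(lazy)         # <H1>
print(greedy_tags)  # ['one</tag> and <tag>two']
print(lazy_tags)    # ['one', 'two']
```

With more than one tag on a line, the greedy version swallows everything between the first opening tag and the last closing tag, which is why the '?' matters in that findall call.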


FuManChu

P.S. Top-posting may get you flayed alive by some around here.



