re.sub and named groups

MRAB google at mrabarnett.plus.com
Wed Feb 4 12:17:54 EST 2009


Emanuele D'Arrigo wrote:
 > Hi everybody,
 >
 > I'm having a ball with the power of regular expression but I stumbled
 > on something I don't quite understand:
 >
 > theOriginalString = "spam:(?P<first>.*) ham:(?P<second>.*)"
 > aReplacementPattern = "\(\?P<first>.*\)"
 > aReplacementString= "foo"
 > re.sub(aReplacementPattern , aReplacementString, theOriginalString)
 >
 > results in :
 >
 > "spam:foo"
 >
 > instead, I was expecting:
 >
 > "spam:foo ham:"
 >
 > Why is that?
 >
 > Thanks for your help!
 >
Quantifiers such as "*" are normally greedy: they try to match as much as
possible. The ".*" in your pattern therefore runs on to the last ")" in
the string and matches:

spam:(?P<first>.*) ham:(?P<second>.*)
               ^^^^^^^^^^^^^^^^^^^^^
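
To see this at the interpreter, here is a minimal sketch. The variable
names are mine, not from the original post, and the pattern is written as
a raw string so the backslashes reach the re module unchanged:

```python
import re

# The string being searched (it happens to look like a regex itself).
original = "spam:(?P<first>.*) ham:(?P<second>.*)"

# Greedy pattern from the question, written as a raw string.
greedy = r"\(\?P<first>.*\)"

# group(0) is the whole text the pattern consumed.
greedy_match = re.search(greedy, original).group(0)
greedy_result = re.sub(greedy, "foo", original)

print(greedy_match)   # (?P<first>.*) ham:(?P<second>.*)
print(greedy_result)  # spam:foo
```

The whole tail of the string is consumed, which is why only "spam:foo"
is left after the substitution.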

You could use the lazy form "*?", which tries to match as little as
possible, e.g. "\(\?P<first>.*?\)", where the ".*?" matches:

spam:(?P<first>.*) ham:(?P<second>.*)
               ^^

giving "spam:foo ham:(?P<second>.*)".
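
The same check for the lazy form, again as a rough sketch with variable
names of my own choosing:

```python
import re

original = "spam:(?P<first>.*) ham:(?P<second>.*)"

# Lazy pattern: ".*?" stops at the first ")" that lets the match succeed.
lazy = r"\(\?P<first>.*?\)"

lazy_match = re.search(lazy, original).group(0)
lazy_result = re.sub(lazy, "foo", original)

print(lazy_match)   # (?P<first>.*)
print(lazy_result)  # spam:foo ham:(?P<second>.*)
```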


