[Tutor] Substring substitution
Kent Johnson
kent37 at tds.net
Thu Sep 8 21:36:30 CEST 2005
Bernard Lebel wrote:
> Hello,
>
> I have a string, and I use a regular expression to search a match in
> it. When I find one, I would like to break down the string, using the
> matched part of it, to be able to perform some formatting and to later
> build a brand new string with the separate parts.
>
> The regular expression part works ok, but my problem is to extract the
> matched pattern from the string. I'm not sure how to do that...
>
>
> sString = 'mt_03_04_04_anim'
>
> # Create regular expression object
> oRe = re.compile( "\d\d_\d\d\_\d\d" )
>
> # Break-up path
> aString = sString.split( os.sep )
>
> # Iterate individual components
> for i in range( 0, len( aString ) ):
>
>     sSubString = aString[i]
>
>     # Search with shot number of 2 digits
>     oMatch = oRe.search( sSubString )
>
>     if oMatch != None:
>         # Replace last sequence of two digits by 3 digits!!
Hi Bernard,
It sounds like you need to put some groups into your regex and use re.sub().
By putting groups in the regex you can refer to pieces of the match. For example:
>>> import re
>>> s = 'mt_03_04_04_anim'
>>> oRe = re.compile( r"(\d\d_\d\d_)(\d\d)" )
>>> m = oRe.search(s)
>>> m.group(1)
'03_04_'
>>> m.group(2)
'04'
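If what you want is literally to break the string into pieces around the match, the match object also knows where the match starts and ends. Here is a minimal sketch using the same string and pattern as above; the names before, matched and after are just ones I picked for illustration:

```python
import re

s = 'mt_03_04_04_anim'
m = re.search(r'(\d\d_\d\d_)(\d\d)', s)

# Slice the original string around the match.
before = s[:m.start()]    # text before the match
matched = m.group(0)      # the whole match
after = s[m.end():]       # text after the match

print(before, matched, after)   # mt_ 03_04_04 _anim
print(m.groups())               # ('03_04_', '04')
```

You can then format the pieces however you like and join them back together.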
With re.sub(), you provide a replacement pattern that can refer to the groups from the match pattern, so inserting new characters between the groups is easy:
>>> oRe.sub(r'\1XX\2', s)
'mt_03_04_XX04_anim'
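One thing to watch for if the text you insert starts with a digit: r'\10\2' would be read as a reference to group 10, not group 1 followed by '0'. The \g<1> form avoids the ambiguity. As a sketch, assuming that "two digits to three digits" means padding the last number with a leading zero (I'm guessing at that from your comment):

```python
import re

s = 'mt_03_04_04_anim'
pattern = re.compile(r'(\d\d_\d\d_)(\d\d)')

# \g<1> and \g<2> name the groups explicitly, so the literal '0'
# between them cannot be mistaken for part of a group number.
padded = pattern.sub(r'\g<1>0\g<2>', s)
print(padded)   # mt_03_04_004_anim
```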
This may be enough power to do what you want; I'm not sure from your description. But re.sub() has another trick up its sleeve: the replacement can be a callable, which is passed the match object and returns the string to substitute for the match. For example, to find every two-digit number in a string and add one to it, you could do this:
>>> def incMatch(m):
...     s = m.group(0) # use the whole match
...     return str(int(s)+1).zfill(2) # add one, keep at least two digits
...
>>> re.sub(r'\d\d', incMatch, '01_09_23')
'02_10_24'
This capability can be used to do complicated replacements.
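For instance, a callable can combine groups with ordinary string methods. The sketch below again assumes the goal is to zero-pad the last number to three digits; SHOT_RE, widen_shot and the path are made up for illustration. Since re.sub() replaces every match, it can be run on a whole path without splitting it first:

```python
import re

SHOT_RE = re.compile(r'(\d\d_\d\d_)(\d\d)')

def widen_shot(m):
    # Keep the first two numbers as they are and
    # zero-pad the last one to three digits.
    return m.group(1) + m.group(2).zfill(3)

path = 'shots/mt_03_04_04_anim/mt_03_04_12_comp'   # hypothetical path
print(SHOT_RE.sub(widen_shot, path))
# shots/mt_03_04_004_anim/mt_03_04_012_comp
```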
Kent