Replace various regex

Jean-Michel Pichavant jeanmichel at sequans.com
Mon Feb 15 09:03:40 EST 2010


Martin wrote:
> Hi,
>
> I am trying to come up with a more generic scheme to match and replace
> a series of regex, which look something like this...
>
> 19.01,16.38,0.79,1.26,1.00   !  canht_ft(1:npft)
> 5.0, 4.0, 2.0, 4.0, 1.0      !  lai(1:npft)
>
> Ideally match the pattern to the right of the "!" sign (e.g. lai), I
> would then like to be able to replace one or all of the corresponding
> numbers on the line. So far I have a rather unsatisfactory solution,
> any suggestions would be appreciated...
>
> The file read in is an ascii file.
>
> f = open(fname, 'r')
> s = f.read()
>
> if CANHT:
>     s = re.sub(r"\d+.\d+,\d+.\d+,\d+.\d+,\d+.\d+,\d+.\d+   !
> canht_ft", CANHT, s)
>
> where CANHT might be
>
> CANHT = '115.01,16.38,0.79,1.26,1.00   !  canht_ft'
>
> But this involves me passing the entire string.
>
> Thanks.
>
> Martin
>   

I removed all lines containing things like 9*0.0 from your file, because I
don't know what they mean or how to handle them. They are not plain numbers.

import re

replace = {
    'snow_grnd' : (1, '99.99,'), # replace the 1st number by 99.99
    't_soil' : (2, '88.8,'), # replace the 2nd number by 88.8
    }

testBuffer = """
 0.749, 0.743, 0.754, 0.759  !  stheta(1:sm_levels)(top to bottom)
0.46                         !  snow_grnd
276.78,277.46,278.99,282.48  !  t_soil(1:sm_levels)(top to bottom)
19.01,16.38,0.79,1.26,1.00   !  canht_ft(1:npft)
200.0, 4.0, 2.0, 4.0, 1.0 !  lai(1:npft)
"""

outputBuffer = ''
for line in testBuffer.split('\n'):
    for key, (index, repl) in replace.items():
        if key in line:
            parameters = {
                # pattern for a single number; given your example you will
                # have to change this one, since I don't know what 9*0.0
                # means in your file
                'n' : r'[\d\.]+',
                'index' : index - 1,
            }
            # group 1 silently matches the numbers before the <index>th one,
            # group 2 captures the <index>th number itself, and group 3 is
            # the rest of the line (remaining numbers and the "!" comment)
            # regexps are sometimes a nightmare to read
            pattern = r'(\s*(?:(?:%(n)s)[,\s]+){0,%(index)s})(?:(%(n)s)[,\s]+)(.*!.*)' % parameters
            line = re.sub(pattern, r'\1 '+repl+r'\3' , line)
            break
    outputBuffer += line +'\n'

print(outputBuffer)
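
If the regexp gets too hard to read, the same idea can be sketched without
one: split each line at the "!", replace the wanted value in the list of
numbers, and put the line back together. This is only a sketch, not a tested
drop-in. The helper names replace_value and replace_by_key are made up for
illustration, it assumes the values are comma-separated, and the replacement
strings carry no trailing comma here.

```python
def replace_value(line, index, new_value):
    """Replace the <index>th (1-based) comma-separated value left of '!'."""
    data, sep, comment = line.partition('!')
    if not sep:
        return line  # no '!' on this line, leave it untouched
    values = [v.strip() for v in data.split(',')]
    if not 1 <= index <= len(values):
        return line  # index out of range, leave the line untouched
    values[index - 1] = new_value
    # pad the data part so the '!' stays in its original column when possible
    return ','.join(values).ljust(len(data)) + sep + comment


def replace_by_key(text, replacements):
    """Apply {key: (index, new_value)} to every line whose comment has key."""
    out = []
    for line in text.split('\n'):
        for key, (index, new_value) in replacements.items():
            # look for the key only to the right of the '!'
            if key in line.partition('!')[2]:
                line = replace_value(line, index, new_value)
                break
        out.append(line)
    return '\n'.join(out)


sample = (
    "0.46                         !  snow_grnd\n"
    "276.78,277.46,278.99,282.48  !  t_soil(1:sm_levels)(top to bottom)\n"
    "5.0, 4.0, 2.0, 4.0, 1.0      !  lai(1:npft)"
)
result = replace_by_key(sample, {'snow_grnd': (1, '99.99'),
                                 't_soil': (2, '88.8')})
print(result)
```

Lines without a matching key (the lai line above) pass through unchanged.
Entries like 9*0.0 would still need their own handling before the split.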
