Replace various regex

Martin mdekauwe at gmail.com
Mon Feb 15 09:13:17 EST 2010


On Feb 15, 2:03 pm, Jean-Michel Pichavant <jeanmic... at sequans.com>
wrote:
> Martin wrote:
> > Hi,
>
> > I am trying to come up with a more generic scheme to match and replace
> > a series of regex, which look something like this...
>
> > 19.01,16.38,0.79,1.26,1.00   !  canht_ft(1:npft)
> > 5.0, 4.0, 2.0, 4.0, 1.0      !  lai(1:npft)
>
> > Ideally match the pattern to the right of the "!" sign (e.g. lai), I
> > would then like to be able to replace one or all of the corresponding
> > numbers on the line. So far I have a rather unsatisfactory solution,
> > any suggestions would be appreciated...
>
> > The file read in is an ascii file.
>
> > f = open(fname, 'r')
> > s = f.read()
>
> > if CANHT:
> >     s = re.sub(r"\d+\.\d+,\d+\.\d+,\d+\.\d+,\d+\.\d+,\d+\.\d+   !  canht_ft", CANHT, s)
>
> > where CANHT might be
>
> > CANHT = '115.01,16.38,0.79,1.26,1.00   !  canht_ft'
>
> > But this involves me passing the entire string.
>
> > Thanks.
>
> > Martin
>
> I removed all lines containing things like 9*0.0 from your file, because
> I don't know what they mean or how to handle them. These are not numbers.
>
> import re
>
> replace = {
>     'snow_grnd' : (1, '99.99,'), # replace the 1st number by 99.99
>     't_soil' : (2, '88.8,'), # replace the 2nd number by 88.8
>     }
>
> testBuffer = """
>  0.749, 0.743, 0.754, 0.759  !  stheta(1:sm_levels)(top to bottom)
> 0.46                         !  snow_grnd
> 276.78,277.46,278.99,282.48  !  t_soil(1:sm_levels)(top to bottom)
> 19.01,16.38,0.79,1.26,1.00   !  canht_ft(1:npft)
> 200.0, 4.0, 2.0, 4.0, 1.0 !  lai(1:npft)
> """
>
> outputBuffer = ''
> for line in testBuffer.split('\n'):
>     for key, (index, repl) in replace.items():
>         if key in line:
>             parameters = {
>                 # given your example you have to change this one;
>                 # I don't know what 9*0.0 means in your file
>                 'n' : r'[\d\.]+',
>                 'index' : index - 1,
>             }
>             # the following pattern silently matches the numbers before the
>             # <index>th number, then uses a capturing group for that number
>             # (regexps are sometimes a nightmare to read)
>             pattern = (r'(\s*(?:(?:%(n)s)[,\s]+){0,%(index)s})'
>                        r'(?:(%(n)s)[,\s]+)(.*!.*)') % parameters
>             line = re.sub(pattern, r'\1 '+repl+r'\3' , line)
>             break
>     outputBuffer += line +'\n'
>
> print outputBuffer

Thanks, I will take a look. I think I was having a very slow day when I
posted. I have since realised that the problem is simpler than I first
perceived and can be solved more directly. It is enough to match the tag
to the right of the "!" sign and use it to replace what lies to the left
of the "!" sign. This is what I currently have. If anyone thinks there
is a neater solution I am happy to hear it. Many thanks.

variable_tag = 'lai'
variable = [200.0, 60.030, 0.060, 0.030, 0.030]

# generate adjustment string
variable = ",".join(["%s" % i for i in variable]) + ' !  ' + variable_tag

# call func to adjust input file
adjustStandardPftParams(variable, variable_tag, in_param_fname,
                        out_param_fname)

The body of this function looks like this:

def adjustStandardPftParams(self, variable, variable_tag, in_fname,
                            out_fname):

    f = open(in_fname, 'r')
    of = open(out_fname, 'w')

    for line in f:
        # a line whose "!" comment starts with the tag is replaced wholesale
        if re.search(r"!\s+" + variable_tag, line):
            print 'yes'
            print >> of, "%s" % variable
        else:
            # every other line is copied through unchanged
            of.write(line)

    f.close()
    of.close()

    return
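
For comparison, here is a minimal sketch of the same tag-based replacement working on a string rather than on files. It is written in Python 3 syntax, unlike the Python 2 code above, and the names replace_tagged_values and sample are hypothetical, not from the thread. It assumes each tag appears after a "!" as in the sample lines, escapes the tag before building the regex, and keeps the rest of the comment (e.g. "(1:npft)") instead of rewriting it.

```python
import re


def replace_tagged_values(text, tag, values):
    """Return text with the numbers left of '! <tag>' replaced by values."""
    # re.escape guards against regex metacharacters in the tag; the negative
    # lookahead stops 'lai' from also matching a longer tag such as 'lai_max'.
    tag_re = re.compile(r"!\s*" + re.escape(tag) + r"(?![A-Za-z0-9_])")
    new_values = ",".join("%s" % v for v in values)
    out = []
    for line in text.splitlines(True):  # True keeps the line endings
        match = tag_re.search(line)
        if match:
            # Keep everything from the "!" onwards (tag and trailing comment).
            line = new_values + "   " + line[match.start():]
        out.append(line)
    return "".join(out)


sample = (
    "19.01,16.38,0.79,1.26,1.00   !  canht_ft(1:npft)\n"
    "5.0, 4.0, 2.0, 4.0, 1.0      !  lai(1:npft)\n"
)
print(replace_tagged_values(sample, "lai", [200.0, 60.03, 0.06, 0.03, 0.03]))
```

Only the lai line changes here; the canht_ft line passes through untouched. Reading the file into a string and writing the result back out would be left to the caller.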


