Replace various regex

MRAB python at mrabarnett.plus.com
Fri Feb 12 15:30:29 EST 2010


McColgst wrote:
> On Feb 12, 2:39 pm, Martin <mdeka... at gmail.com> wrote:
>> Hi,
>>
>> I am trying to come up with a more generic scheme to match and replace
>> a series of regex, which look something like this...
>>
>> 19.01,16.38,0.79,1.26,1.00   !  canht_ft(1:npft)
>> 5.0, 4.0, 2.0, 4.0, 1.0      !  lai(1:npft)
>>
>> Ideally match the pattern to the right of the "!" sign (e.g. lai), I
>> would then like to be able to replace one or all of the corresponding
>> numbers on the line. So far I have a rather unsatisfactory solution,
>> any suggestions would be appreciated...
>>
>> The file read in is an ascii file.
>>
>> f = open(fname, 'r')
>> s = f.read()
>>
>> if CANHT:
>>     s = re.sub(r"\d+.\d+,\d+.\d+,\d+.\d+,\d+.\d+,\d+.\d+   !
>> canht_ft", CANHT, s)
>>
>> where CANHT might be
>>
>> CANHT = '115.01,16.38,0.79,1.26,1.00   !  canht_ft'
>>
>> But this involves me passing the entire string.
>>
>> Thanks.
>>
>> Martin
> 
> If I understand correctly, there are a couple of ways to do it.
> One is to use .split() and split on the '!' sign, given that you won't
> have more than one '!' on a line. This will return a list of the pieces
> split by the delimiter, in this case '!', so you should get back
> ['19.01,16.38,0.79,1.26,1.00   ', '  canht_ft(1:npft)'] and you can do
> whatever replace functions you want using the list.
> 
> check out split: http://docs.python.org/library/stdtypes.html#str.split
> 
The .split method is the best way if you process the file a line at a
time. It also accepts a maxsplit argument, so you can split a line no
more than once.
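
A minimal sketch of the line-at-a-time approach, assuming the label always
follows the first '!' and that a fixed three-space gap before the '!' is
acceptable. The function name replace_by_split and its arguments are made
up for illustration:

```python
def replace_by_split(line, name, new_values):
    """Return line with its numbers replaced if its label starts with name."""
    # Split no more than once, so any later '!' stays with the label.
    parts = line.split("!", 1)
    if len(parts) == 2 and parts[1].strip().startswith(name):
        # Keep the label part untouched and swap in the new numbers.
        return "%s   !%s" % (new_values, parts[1])
    # No '!' or a different label: leave the line alone.
    return line

line = "5.0, 4.0, 2.0, 4.0, 1.0      !  lai(1:npft)"
print(replace_by_split(line, "lai", "6.0, 4.0, 2.0, 4.0, 1.0"))
```

Note that startswith would also accept a longer label that merely begins
with the same letters, so tighten the test if your labels overlap.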

> Another is that in your regular expression, you can match the first part
> or second part of the string by specifying where the '!' is.
> If you want to match the part after the '!' I would do something like
> r"[^! canht_ft]", or something similar (I'm not particularly up-to-date
> with my regex syntax, but I think you get the idea.)
> 
The regex would be r"(?m)^[^!]*(!.*)" to capture the '!' and the rest of
the line.
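
If you would rather work on the whole file contents at once, something
along these lines might do. It is only a sketch: replace_by_regex is a
made-up name, the pattern is a variation on the one above that adds the
label and excludes newlines from the part before the '!', and the
three-space gap in the output is an assumption:

```python
import re

def replace_by_regex(text, name, new_values):
    """Replace the numbers on every line whose label starts with name."""
    # Group 1 captures the '!' and the rest of the line, as above;
    # [^!\n]* keeps the match from running over a line that has no '!'.
    pattern = r"(?m)^[^!\n]*(!\s*%s\b.*)" % re.escape(name)
    # A function is used as the replacement so that backslashes in
    # new_values are never treated as escapes.
    return re.sub(pattern, lambda m: new_values + "   " + m.group(1), text)

s = ("19.01,16.38,0.79,1.26,1.00   !  canht_ft(1:npft)\n"
     "5.0, 4.0, 2.0, 4.0, 1.0      !  lai(1:npft)\n")
print(replace_by_regex(s, "canht_ft", "115.01,16.38,0.79,1.26,1.00"))
```

That way you pass only the new numbers and the label, not the entire
replacement line.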

> I hope I understood correctly, and I hope that helps.
> 



