String Replace Problem...

Peter Hansen peter at engcorp.com
Mon Feb 28 11:13:24 EST 2005


andrea.gavana at agip.it wrote:
>   'PERMX'  @PERMX1  1 34  1  20  1 6     /
...
>   'PERMX'  @PERMX10  1 34  21 41 29 34    /
...
> 
> I would like to replace all the occurrences of the "keywords" (beginning
> with the @ (AT) symbol) with some floating point value. As an example, this
> is what I do:
> 
> # Find All Keywords Starting with @ (AT)
> regex = re.compile("[\@]\w+", re.IGNORECASE)

You don't need the [, \ or ] around the "@" as it is
not a special character... (but that's not your problem here).
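
A quick check of that point, as a sketch: the sample line is taken from
the data quoted above, and the names `original` and `simpler` are just
for illustration. Raw strings are used so the backslashes reach the
regex engine unchanged. The re.IGNORECASE flag is also not needed here,
since \w already matches both upper and lower case letters.

```python
import re

# One line of the input data quoted in the original post.
sample = "'PERMX'  @PERMX1  1 34  1  20  1 6     /"

# The pattern as posted (written as a raw string) and the simpler form.
original = re.compile(r"[\@]\w+", re.IGNORECASE)
simpler = re.compile(r"@\w+")

# Both patterns find the same keyword.
print(original.findall(sample))
print(simpler.findall(sample))
```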

> keywords = regex.findall(onread)

Here you get a list, in order, of all the matches, including
these: "@PERMX1" and "@PERMX10"

> # Try To Replace Them With Floats
> for keys in keywords:
>     onread = string.replace(onread, keys, str(float(pars)))

Here you iterate through the list, replacing *all*
occurrences of each of them, one at a time, in the
full string.

Now imagine what happens to things like "@PERMX10" in the
full string when you are replacing "@PERMX1"....
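
To make that concrete, here is a minimal sketch of the failure. The
two-line `text` is a shortened version of the data above, and the
replacement value 2.0 is made up for illustration.

```python
# Shortened, hypothetical version of the input file contents.
text = "'PERMX'  @PERMX1  1 34\n'PERMX'  @PERMX10  1 34"

# Made-up replacement value; the original post used str(float(pars)).
value = str(float(2))

# Replacing "@PERMX1" also rewrites the first seven characters of
# "@PERMX10", leaving its trailing "0" behind in the text.
result = text.replace("@PERMX1", value)
print(result)
```

By the time the loop reaches "@PERMX10" there is nothing left for it to
match, so the stray digit stays in the output.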

> Now, if you try to run this little script, you will see that for keywords
> starting from "@PERMX10", the replaced values are WRONG. I don't know why,
> Python replaces only the "@PERMX1", leaving out the last char of the keyword
> (that is 0, 1, 2, 3, 4, 5, 6). These values are left in the file and I
> don't get the expected result.
> 
> Does anyone have an explanation? What am I doing wrong?

Basically, you are replacing the keywords in the string one by one
without taking into account the fact that some of them contain
others (for example, "@PERMX10" contains "@PERMX1").

Looking into "re.sub" might help, since it replaces each complete
match in a single pass, although in this case you could solve the
problem by doing one of several other things.
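
One way re.sub could be used here, as a sketch only: the `values`
mapping and the `substitute` helper are hypothetical (the original
script used a single `pars` value). Because \w+ is greedy, each match
covers the whole keyword, so "@PERMX10" is never split.

```python
import re

# Shortened, hypothetical version of the input file contents.
text = "'PERMX'  @PERMX1  1 34\n'PERMX'  @PERMX10  1 34"

# Hypothetical mapping from keyword to replacement value.
values = {"@PERMX1": 0.5, "@PERMX10": 12}

def substitute(match):
    # match.group(0) is the complete keyword, e.g. "@PERMX10".
    return str(float(values[match.group(0)]))

# Every keyword is replaced in one pass over the text.
result = re.sub(r"@\w+", substitute, text)
print(result)
```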

The simplest one that comes to mind is to sort the list of
keywords in reverse order by length of string, so that you
replace the longest items first.
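
A minimal sketch of that approach, assuming a shortened `onread` and a
made-up `pars` value. It uses the str.replace method rather than the
string module function, which should behave the same for this purpose.

```python
import re

# Shortened, hypothetical version of the input file contents.
onread = "'PERMX'  @PERMX1  1 34\n'PERMX'  @PERMX10  1 34"
pars = 3  # made-up value standing in for the original `pars`

keywords = re.findall(r"@\w+", onread)

# Longest keywords first, so "@PERMX10" is replaced before "@PERMX1".
for key in sorted(keywords, key=len, reverse=True):
    onread = onread.replace(key, str(float(pars)))

print(onread)
```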

-Peter


