String Replace only if whole word?

Fri Nov 17 11:12:19 EST 2006

>   I have been using the string.replace(from_string, to_string, len(string))
> to replace names in a file with their IP address.
>   For example, I have a definitions file that looks something like:
> 10.1.3.4   LANDING_GEAR
> 20.11.222.4   ALTIMETER_100
> 172.18.50.138 SIB
> 172.18.50.138 LAPTOP
> 172.18.51.32  WIN2000
> 127.0.0.1 LOCALHOST
> 
>   and I have a text file (a Python script) that has these names in the file.
> In most cases the string.replace() command works great. But there is one
> instance in which it fails:
> Suppose I had in the file:
>  if (LAPTOP_IS_UP()):
>    It would replace the string with:
>  if ("172.18.50.138"_IS_UP()):
> 
> Is there any easy way to avoid this, only replace if a whole
> word matches? I probably need something which determines when
> a word ends, and I will define a word as containing only
> 'A'-'Z','a'-'z','0'-'9','_' . As long as the string contains
> more word characters after the match, don't replace?

A solution I've used in the past (I'd be interested to get
feedback from the list on whether it's a good one):

 >>> mapping = {}
 >>> for line in open('definitions.txt'):
...     ip, name = line.split()
...     mapping[name] = '"%s"' % ip  # store the IP already wrapped in quotes
...
 >>> import re
 >>> s = "LAPTOP LAPTOP_IS_UP"
 >>> r = re.compile(r'\b\w+\b')
 >>> r.sub(lambda match: mapping.get(match.group(0),
...       match.group(0)), s)
'"172.18.50.138" LAPTOP_IS_UP'

The regexp r'\b\w+\b' finds "word"s (what \w matches can vary with
the LOCALE and UNICODE flags, but it could easily be rewritten as
r'\b[a-zA-Z0-9_]+\b' if you needed), while the '\b' on each side
ensures that there's a word-to-non-word boundary at both ends of
the match.
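
As a rough illustration of the explicit character-class spelling, here is a small sketch that lists the "words" the pattern picks out of a line like the one in the question. The name WORD and the sample line are made up for the example.

```python
import re

# Explicit ASCII word class, so the match does not depend on regex flags.
# WORD is a hypothetical name chosen for this sketch.
WORD = re.compile(r'\b[a-zA-Z0-9_]+\b')

# LAPTOP_IS_UP comes back as a single word, separate from the bare LAPTOP.
words = WORD.findall('if (LAPTOP_IS_UP()): ping(LAPTOP)')
print(words)
```

Because LAPTOP_IS_UP is matched as one word, a lookup keyed on LAPTOP never sees it.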

It then goes through and replaces every word in the string with 
either the result of looking it up in the magic mapping, or with 
its original contents.
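
The lookup-or-keep step may be easier to follow with the lambda spelled out as a named function. This is only a sketch: the function name lookup, the hard-coded mapping, and the sample line are assumptions for illustration, and in practice the mapping would be built from the definitions file.

```python
import re

# Hypothetical mapping; normally built from the definitions file.
mapping = {'LAPTOP': '"172.18.50.138"', 'LOCALHOST': '"127.0.0.1"'}

def lookup(match):
    """Return the replacement for a known name, else the word unchanged."""
    word = match.group(0)
    return mapping.get(word, word)

# re.sub calls lookup() once for every whole word it finds.
result = re.sub(r'\b\w+\b', lookup, 'if (LAPTOP_IS_UP()): ping(LAPTOP)')
print(result)
```

Only the bare LAPTOP is in the mapping, so LAPTOP_IS_UP passes through untouched.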

-tkc