string.replace() or re.subn()
Duncan Booth
duncan at rcp.co.uk
Fri Sep 1 11:29:36 EDT 2000
bragib at my-deja.com wrote in <8oogfa$mov$1 at nnrp1.deja.com>:
>Hi:
>
>I have the following problem where I am replacing periods in certain
>names in a text file by underscores. So for example if I have
>these names in the file [set.1, set.1.1] I would like to replace
>them everywhere by set_1 and set_1_1. Now the catch is I can
>have a line like this
>
>line = '1.0, 2.0, set.1, set.1.1'
>
>for name in ['set.1', 'set.1.1']:
> line = string.replace(line, name, string.replace(name,'.','_'))
> print line
>
>1.0, 2.0, set_1, set_1.1
>1.0, 2.0, set_1, set_1.1
>
>which is not what I wanted. I moved away from using re.sub because the
>names can potentially contain characters such as
>!@#$%^&*()_-+={}[]\|~`?/<>.,
>
I'm not convinced you have given enough information here for a definitive
answer. If your names include set.1 and 1.set, and the input line contains
the text set.1.set, which of the two dots would you like replaced?
If the answer is both then try:

import re, string

for name in ['set.1', 'set.1.1']:
    # Escape regex metacharacters, then let each escaped dot match '.' or '_'
    pat = string.replace(re.escape(name), '\\.', '(\\.|_)')
    repl = string.replace(name, '.', '_')
    line = re.sub(pat, repl, line)
    print line

This should handle all your funny characters correctly by escaping them
first, and it handles the overlapping replacements by matching either '.'
or '_' wherever a name contains a dot.
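
For readers on a current interpreter, here is a minimal sketch of the same idea, assuming Python 3 (string methods instead of the string module). The function name replace_names is hypothetical, and passing a function as the replacement is my own adjustment: re.sub interprets backslashes in a replacement string, which could matter for names containing '\'.

```python
import re

def replace_names(line, names):
    """Replace dots with underscores inside each listed name (hypothetical helper)."""
    for name in names:
        # Escape regex metacharacters, then let each escaped dot match '.' or '_'
        # so that a dot already converted by an earlier, shorter name still matches.
        pat = re.escape(name).replace('\\.', '(?:\\.|_)')
        repl = name.replace('.', '_')
        # A function replacement is inserted literally, so any backslash in
        # repl is not treated as an escape sequence by re.sub.
        line = re.sub(pat, lambda m, repl=repl: repl, line)
    return line

print(replace_names('1.0, 2.0, set.1, set.1.1', ['set.1', 'set.1.1']))
# prints: 1.0, 2.0, set_1, set_1_1
```

The numbers 1.0 and 2.0 are left alone because no listed name matches them.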
Of course someone will point out that this will produce the wrong result
for an input that is already '1.0, 2.0, set_1, set_1.1', as it will change
the output when it shouldn't. If that worries you I suggest you search the
string for all locations where you could replace a dot with an underscore,
remember them, then go back and do all the replacements after you have
finished all the searches.
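
The search-first, replace-later approach could be sketched roughly as follows, again assuming Python 3. The function name replace_dots_in_names is hypothetical, and it uses plain substring search rather than regular expressions, so the funny characters need no escaping.

```python
def replace_dots_in_names(line, names):
    """Replace '.' with '_' inside every occurrence of a listed name."""
    dot_positions = set()
    # Pass 1: search the unmodified line and remember every dot to change.
    for name in names:
        if not name:
            continue  # an empty name would match everywhere
        start = line.find(name)
        while start != -1:
            for offset, ch in enumerate(name):
                if ch == '.':
                    dot_positions.add(start + offset)
            # Advance by one character so overlapping occurrences are found too.
            start = line.find(name, start + 1)
    # Pass 2: apply all the remembered replacements at once.
    chars = list(line)
    for pos in dot_positions:
        chars[pos] = '_'
    return ''.join(chars)

print(replace_dots_in_names('1.0, 2.0, set.1, set.1.1', ['set.1', 'set.1.1']))
# prints: 1.0, 2.0, set_1, set_1_1
```

Because every search runs against the original line, an input that is already '1.0, 2.0, set_1, set_1.1' comes back unchanged, and set.1.set with the names set.1 and 1.set has both dots replaced. Like the loop above, it matches names as plain substrings, so set.10 would also be treated as containing set.1.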