string.replace() or re.subn()

bragib at my-deja.com
Fri Sep 1 14:22:47 EDT 2000


In article <8FA2A88A0duncanrcpcouk at 194.238.50.13>,
  duncan at rcp.co.uk (Duncan Booth) wrote:
> bragib at my-deja.com wrote in <8oogfa$mov$1 at nnrp1.deja.com>:
>
> >Hi:
> >
> >I have the following problem where I am replacing periods in certain
> >names in a text file by underscores.  So for example if I have
> >these names in the file [set.1, set.1.1] I would like to replace
> >them everywhere by set_1 and set_1_1.  Now the catch is I can
> >have a line like this
> >
> >line = '1.0, 2.0, set.1, set.1.1'
> >
> >for name in ['set.1', 'set.1.1']:
> >    line = string.replace(line, name, string.replace(name,'.','_'))
> >    print line
> >
> >1.0, 2.0, set_1, set_1.1
> >1.0, 2.0, set_1, set_1.1
> >
> >which is not what I wanted.  I moved away from using re.sub because the
> >names can potentially contain characters such as
> >!@#$%^&*()_-+={}[]\|~`?/<>.,
> >
> I'm not convinced you have given enough information here for a definitive
> answer. If your names include set.1 and 1.set, and the input line contains
> the text set.1.set, which of the two dots would you like replaced?
>
> If the answer is both then try:
>
> for name in ['set.1', 'set.1.1']:
>     pat = string.replace(re.escape(name), '\\.', '(\\.|_)')
>     repl = string.replace(name, '.', '_')
>     line = re.sub(pat, repl, line)
>     print line
>
> which should handle all your funny characters correctly by first escaping
> them, and handles the overlapping replacements by matching either . or _
>
> Of course someone will point out that this will produce the wrong result
> for an input that is already '1.0, 2.0, set_1, set_1.1', as it will change
> the output when it shouldn't. If that worries you I suggest you search the
> string for all locations where you could replace a dot with an underscore,
> remember them, then go back and do all the replacements after you have
> finished all the searches.
>
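
The "search first, replace afterwards" idea at the end of the quoted reply
could be sketched roughly as below. This is only one possible reading of that
suggestion, not code from the thread: the function name underscore_names is
hypothetical, and it assumes a current Python where strings have a find()
method rather than the string module functions used above.

```python
def underscore_names(line, names):
    """Replace the dots inside every occurrence of any name with underscores.

    Hypothetical helper: all searches run against the unmodified line,
    so one replacement can never hide a later match.
    """
    dot_positions = set()
    # Pass 1: search the original line and remember each dot to change.
    for name in names:
        start = line.find(name)
        while start != -1:
            for offset, ch in enumerate(name):
                if ch == '.':
                    dot_positions.add(start + offset)
            # Advance by one so overlapping occurrences are found too.
            start = line.find(name, start + 1)
    # Pass 2: apply all remembered replacements in one go.
    return ''.join('_' if i in dot_positions else ch
                   for i, ch in enumerate(line))
```

Because the names are matched literally with find(), the punctuation
characters mentioned in the original question need no escaping, and a line
that has already been converted is left unchanged.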

I want to replace both... actually I think I found a solution.
I sort the list such that the longer name of the two appears
later in the list.  Then I do my replacements by starting from
the end of the list.  That's how I ensure that set.1.1 gets
replaced before set.1
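
A minimal sketch of that ordering idea, assuming a current Python with the
str.replace() method in place of string.replace(). The function name
replace_longest_first is hypothetical, and sorting longest-first is used here
as an equivalent of sorting shortest-first and walking the list from the end.

```python
def replace_longest_first(line, names):
    """Replace each name with its underscored form, longest names first.

    Hypothetical helper: handling 'set.1.1' before 'set.1' keeps the
    shorter name from rewriting part of the longer one.
    """
    for name in sorted(names, key=len, reverse=True):
        line = line.replace(name, name.replace('.', '_'))
    return line
```

With the example line from the question this gives
'1.0, 2.0, set_1, set_1_1'. The ordering covers the case where one name is a
prefix of another; it has not been checked here against names that overlap in
other ways.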

Thanks for your elegant solution.

Bragi




