[Tutor] simple text replace

Mon Jul 27 10:10:41 CEST 2009

Hi!

Did you consider using a regex?

import re
re.sub(r"\bpython\b", "snake", "python is cool, pythonprogramming...")
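
If the word to replace arrives in a variable, as in the original question, the same regex idea might look like the sketch below. The helper name replace_word is made up for illustration. It assumes the word starts and ends with a letter, digit, or underscore, since \b only matches at the edge of such characters, and it passes the word through re.escape() so that any regex metacharacters in it are taken literally.

```python
import re

def replace_word(line, original, new):
    """Replace `original` with `new` only where it is a whole word."""
    # \b matches at a boundary between a word character and a non-word one,
    # so "python" inside "pythonprogramming" is left alone.
    pattern = r"\b" + re.escape(original) + r"\b"
    # A function is used as the replacement so that backslashes in `new`
    # are not interpreted as regex escapes.
    return re.sub(pattern, lambda match: new, line)

text = ("python is great, but my python knowledge is limited! "
        "regardless, I enjoy pythonprogramming")
print(replace_word(text, "python", "snake"))
```

Each line read from the file could be passed through this helper in place of line.replace(original, new).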

Cheers!!
Albert-Jan

--- On Mon, 7/27/09, Dave Angel <davea at ieee.org> wrote:

> From: Dave Angel <davea at ieee.org>
> Subject: Re: [Tutor] simple text replace
> To: "j booth" <j8ooth at gmail.com>
> Cc: tutor at python.org
> Date: Monday, July 27, 2009, 12:41 AM
> j booth wrote:
> > Hello,
> > 
> > I am scanning a text file and replacing words with alternatives. My
> > difficulty is that all occurrences are replaced (even if they are part of
> > another word!).
> > 
> > This is an example of what I have been using:
> > 
> >     for line in fileinput.FileInput("test_file.txt",inplace=1):
> >         line = line.replace(original, new)
> >         print line,
> >     fileinput.close()
> > 
> > 
> > original and new are variables that have string values from functions.
> > original finds each word in a text file and new is a manipulated
> > replacement. Essentially, I would like to replace only the occurrence that
> > is currently selected, not the rest. For example:
> > 
> > python is great, but my python knowledge is limited! regardless, I enjoy
> > pythonprogramming
> > 
> > 
> > returns something like:
> > 
> > snake is great, but my snake knowledge is limited! regardless, I enjoy
> > snakeprogramming
> > 
> > 
> > thanks so much!
> > 
> >   
> Not sure what you mean by "currently selected." You're
> processing a line at a time, and there are multiple
> legitimate occurrences of the word in the line.
> 
> The trick is to define what you mean by "word." 
> replace() has no such notion.  So we want to write a
> function such as:
> 
> Given three strings (line, inword, and outword), find
> all occurrences of inword in the line and replace all of
> them with outword.  A word is defined as a group
> of alphabetic characters (a-z perhaps) that is surrounded by
> non-alphabetic characters.
> 
> The approach that I'd use is to prepare a translated copy
> of the line as follows:   Replace each
> non-alphabetic character with a space.  Also insert a
> space at the beginning and one at the end.  Now, take
> the inword, and similarly add spaces at begin and end. 
> Now search this modified line for all occurrences of this
> modified inword, and make a list of the indices where it is
> found.  In your example line, there would be 2 items in
> the list.
> 
> Now, using the original line, use that list of indices to
> substitute the outword in the appropriate places.  Use
> slices to do it, preferably from right to left, so the
> indices will work even though the string is changing. 
> (The easiest way to do right to left is to reverse() the
> list.)
> 
> DaveA
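
The approach Dave describes could be sketched roughly as follows. This is only one reading of his description, not his code: the function name replace_whole_words is invented here, "alphabetic" is taken to mean ASCII letters, matching is case-sensitive, and inword is assumed to be a non-empty string of letters.

```python
import string

def replace_whole_words(line, inword, outword):
    """Replace inword with outword only where it stands alone as a word."""
    # Translated copy: every non-alphabetic character becomes a space,
    # with one extra space added at each end.
    padded = " " + "".join(
        ch if ch in string.ascii_letters else " " for ch in line
    ) + " "
    target = " " + inword + " "

    # Collect the positions where the space-wrapped word occurs. The space
    # added at the front of `padded` and the one at the front of `target`
    # cancel out, so each position is also the word's index in `line`.
    indices = []
    pos = padded.find(target)
    while pos != -1:
        indices.append(pos)
        # Advance by one character so that two words sharing a single
        # separator ("python python") are both found.
        pos = padded.find(target, pos + 1)

    # Substitute with slices from right to left so that the remaining
    # indices stay valid while the string changes length.
    for i in reversed(indices):
        line = line[:i] + outword + line[i + len(inword):]
    return line

text = ("python is great, but my python knowledge is limited! "
        "regardless, I enjoy pythonprogramming")
print(replace_whole_words(text, "python", "snake"))
```

Unlike the \b regex, this treats digits and underscores as separators, so "python3" would become "snake3" here.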