[Tutor] simple text replace
Albert-Jan Roskam
fomcl at yahoo.com
Mon Jul 27 10:10:41 CEST 2009
Hi!
Did you consider using a regex?
import re
re.sub(r"python\s", "snake ", "python is cool, pythonprogramming...")
Cheers!!
Albert-Jan
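
A note on the pattern above: python\s needs a whitespace character after the word, so it would skip "python," or a "python" at the end of a line, and it would still match the tail of a longer word such as "micropython ". Below is a minimal sketch that uses the word-boundary anchor \b instead. It assumes a "word" is whatever the re module treats as a run of \w characters (letters, digits, underscore) and that inword starts and ends with such a character. The helper name replace_word is made up for illustration; it is not from the thread.

```python
import re

def replace_word(line, inword, outword):
    """Replace inword only where it stands alone as a whole word."""
    # re.escape guards against regex metacharacters inside inword;
    # \b matches the boundary between a word character and a non-word one.
    pattern = r"\b" + re.escape(inword) + r"\b"
    # A function replacement keeps any backslashes in outword literal.
    return re.sub(pattern, lambda match: outword, line)

text = ("python is great, but my python knowledge is limited! "
        "regardless, I enjoy pythonprogramming")
print(replace_word(text, "python", "snake"))
# prints: snake is great, but my snake knowledge is limited! regardless, I enjoy pythonprogramming
```

Escaping inword matters when the words come from the file itself, as in the original question, since they may contain characters such as "." or "+".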
--- On Mon, 7/27/09, Dave Angel <davea at ieee.org> wrote:
> From: Dave Angel <davea at ieee.org>
> Subject: Re: [Tutor] simple text replace
> To: "j booth" <j8ooth at gmail.com>
> Cc: tutor at python.org
> Date: Monday, July 27, 2009, 12:41 AM
> j booth wrote:
> > Hello,
> >
> > I am scanning a text file and replacing words with alternatives. My
> > difficulty is that all occurrences are replaced (even if they are part of
> > another word!).
> >
> > This is an example of what I have been using:
> >
> > for line in fileinput.FileInput("test_file.txt",inplace=1):
> >     line = line.replace(original, new)
> >     print line,
> > fileinput.close()
> >
> >
> > original and new are variables that have string values from functions.
> > original finds each word in a text file and new is a manipulated
> > replacement. Essentially, I would like to replace only the occurrence that
> > is currently selected, not the rest. For example:
> >
> > python is great, but my python knowledge is limited! regardless, I enjoy
> > pythonprogramming
> >
> >
> > returns something like:
> >
> > snake is great, but my snake knowledge is limited! regardless, I enjoy
> > snakeprogramming
> >
> >
> > thanks so much!
> >
> >
> Not sure what you mean by "currently selected." You're
> processing a line at a time, and there are multiple
> legitimate occurrences of the word in the line.
>
> The trick is to define what you mean by "word."
> replace() has no such notion. So we want to write a
> function such as:
>
> Given three strings, line, inword, and outword, find
> all occurrences of inword in the line and replace all of
> them with outword. The definition of a word is a group
> of alphabetic characters (a-z perhaps) that is surrounded by
> non-alphabetic characters.
>
> The approach that I'd use is to prepare a translated copy
> of the line as follows: Replace each
> non-alphabetic character with a space. Also insert a
> space at the beginning and one at the end. Now, take
> the inword, and similarly add a space at its beginning and end.
> Now search this modified line for all occurrences of this
> modified inword, and make a list of the indices where it is
> found. In your example line, there would be 2 items in
> the list.
>
> Now, using the original line, use that list of indices to
> substitute the outword in the appropriate places. Use
> slices to do it, preferably from right to left, so the
> indices will work even though the string is changing.
> (The easiest way to do right to left is to reverse() the
> list.)
>
> DaveA
>
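
The index-based approach Dave describes above could be sketched roughly as follows. This is one reading of his description, not his code: the function name replace_whole_words is invented here, "alphabetic" is taken to mean whatever str.isalpha() accepts, inword is assumed to be non-empty and made of letters only, and matching is case-sensitive.

```python
def replace_whole_words(line, inword, outword):
    """Replace inword with outword only where non-letters surround it."""
    # Translated copy: every non-alphabetic character becomes a space, plus
    # one extra space at each end. Character i of line sits at i + 1 here.
    padded = " " + "".join(c if c.isalpha() else " " for c in line) + " "
    target = " " + inword + " "

    # Collect the indices where the padded inword occurs. Advancing by one
    # keeps matches that share a separating space ("python python").
    indices = []
    pos = padded.find(target)
    while pos != -1:
        indices.append(pos)
        pos = padded.find(target, pos + 1)

    # An index into padded points at the space before the word, which is
    # the word's own start in the original line. Work right to left so
    # earlier indices stay valid while the string changes length.
    for pos in reversed(indices):
        line = line[:pos] + outword + line[pos + len(inword):]
    return line

example = ("python is great, but my python knowledge is limited! "
           "regardless, I enjoy pythonprogramming")
print(replace_whole_words(example, "python", "snake"))
# prints: snake is great, but my snake knowledge is limited! regardless, I enjoy pythonprogramming
```

As in the description, the example line yields two indices, and "pythonprogramming" is left alone because no space follows "python" in the translated copy.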