search/replace in Python

John Machin sjmachin at lexicon.net
Sat May 28 06:26:25 EDT 2005


Vamsee Krishna Gomatam wrote:
> Hello,
>     I'm having some problems understanding Regexps in Python. I want
> to replace "<google>PHRASE</google>" with
> "<a href=http://www.google.com/search?q=PHRASE>PHRASE</a>" in a block of 
> text. How can I achieve this in Python? Sorry for the naive question but 
> the documentation is really bad :-(

VKG,

Sorry you had such difficulty; if you can explain in a bit more detail, 
we could perhaps fix that "really bad" [compared with what?] documentation.

Whether you are doing a substitution programatically in Python, or in 
any other language, or manually using a text editor, you need to specify 
some input text, a pattern, and a replacement.

The syntax for pattern and replacement for what you want to do differs 
in only minor details (if at all) among major languages and common text 
editors, and it's been that way for years. For example, see the 
retro-computing museum exhibit at the end of this posting.

So, did you have any problem determining that PHRASE is represented by 
(.*) -- good enough if there is only one occurrence of your target -- in 
the pattern, and by \1 in the replacement?

Did you have a problem with this part of the docs (which is closely 
followed by an example with \1 in the replacement), and if so, what was 
the problem?

    sub( pattern, repl, string[, count])

    Return the string obtained by replacing the leftmost non-overlapping 

    occurrences of pattern in string by the replacement repl.

Have you used regular expressions before? If not, then you shouldn't 
expect to learn how to use them from the documentation, which does 
adequately _document_ the provided functionality. This is just like how 
the manual supplied with a new car documents the car's functionality, 
but doesn't attempt to teach how to drive it. If you haven't done so 
already, you may like to do what the documentation suggests:

consult the Regular Expression HOWTO, accessible from 
http://www.python.org/doc/howto/.

And out with the magic lantern ... here's the promised blast from the 
past, using Oliver's sample input:

C:\junk>dir \bin\ed.com
[snip]
30/11/1985  04:43p              18,936 ed.com
[snip]
C:\junk>ed
Memory available : 59K bytes
 >a
This is a <google>Python</google>. And some randomnonsense test.....
.
 >p
This is a <google>Python</google>. And some randomnonsense test.....
 >s/<google>\(.*\)<\/google>/<a 
href=http:\/\/www.google.com\/search?q=\1>\1<\/a>/
 >p
This is a <a href=http://www.google.com/search?q=Python>Python</a>. And 
some randomnonsense test.....
 >

Cheers,
John



More information about the Python-list mailing list