replace link in html

Gilles Lenfant glenfant at NOSPAM.bigfoot.com
Thu May 29 07:35:36 EDT 2003


"Xah Lee" <xah at xahlee.org> a écrit dans le message de news:
7fe97cc4.0305290151.5bcbb454 at posting.google.com...
> how can i write a script that goes thru a directory, find each .html
> file, replace links of the form
>
> <a href="....suffix">some text</a>
>
> by some other text, where the suffix is one of .mov, .nb and others.
> Note the link may not be in one line, so i need some kind of html
> parser instead just regex search replace. (that's where i'd be stuck)
>
> i'm a python beginner.

The standard HTML parser (in the "htmllib" standard module) does not provide
HTML generation features.
The best way to do this is with regular expressions (the "re" module). Sorry
if that is exactly where you're stuck.
I know the regex pattern you need is not easy for newbies, but it's the
best solution unless you have an HTML marshaler available.
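
For what it's worth, here is a minimal sketch of the regex approach, not a
definitive implementation. The names SUFFIXES, LINK_RE, replace_links and
process_directory are made up for illustration, and the UTF-8 file encoding
is an assumption. The pattern also assumes href is the first attribute of
the tag and is double-quoted, as in the example link in the question.

```python
import os
import re

# Hypothetical suffix list; the question names .mov and .nb "and others".
SUFFIXES = ("mov", "nb")

# re.DOTALL lets "." match newlines, so a link spread over several lines
# still matches. re.IGNORECASE also accepts <A HREF="...">.
LINK_RE = re.compile(
    r'<a\s+href="[^"]*\.(?:%s)"[^>]*>.*?</a>'
    % "|".join(re.escape(s) for s in SUFFIXES),
    re.DOTALL | re.IGNORECASE,
)


def replace_links(html, replacement):
    """Return html with every matching link replaced by replacement."""
    # A function is used as the replacement so that backslashes in
    # replacement are inserted literally.
    return LINK_RE.sub(lambda match: replacement, html)


def process_directory(root, replacement):
    """Rewrite every .html file under root in place (UTF-8 assumed)."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.lower().endswith(".html"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8") as f:
                original = f.read()
            updated = replace_links(original, replacement)
            if updated != original:
                with open(path, "w", encoding="utf-8") as f:
                    f.write(updated)
```

Something like process_directory("some_dir", "some other text") would then
rewrite the files. Try it on a copy of the directory first, since the files
are overwritten in place.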

Search for "redemo.py" in your Python distribution. It's a friendly GUI tool,
built on Tkinter, for testing regular expressions.

--Gilles




