Parsing HTML - modify URLs
Robert Brewer
fumanchu at amor.org
Wed Jul 7 10:47:02 EDT 2004
Fuzzyman wrote:
> I am trying to parse an HTML page and only modify URLs within tags -
> e.g. inside IMG, A, SCRIPT, FRAME tags etc.
>
> I have built one using HTMLParser.HTMLParser and it works fine...
> on good HTML. After googling, it looks like HTMLParser choking on
> dodgy HTML is a common theme.
Haven't used it, but Beautiful Soup sounds like it fits the bill:
http://www.crummy.com/software/BeautifulSoup/
FuManChu
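
For readers who want to see the shape of the task in the question (rewrite only the URL-carrying attributes, pass everything else through), here is a minimal sketch. It is not Beautiful Soup and not the original poster's code: it assumes Python 3 and its standard-library html.parser, which in current releases no longer raises on malformed markup. The names URL_ATTRS, URLRewriter and rewrite_urls, and the tag/attribute table itself, are made up for illustration and probably incomplete.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urljoin

# Assumed table of tags and the attributes that hold URLs; extend as needed.
URL_ATTRS = {
    "a": ("href",), "img": ("src",), "script": ("src",),
    "frame": ("src",), "iframe": ("src",), "link": ("href",),
}


class URLRewriter(HTMLParser):
    """Re-emit the document, passing URL attributes through `rewrite`."""

    def __init__(self, rewrite):
        # convert_charrefs=False so entities reach the handlers below
        # and can be written back out unchanged.
        super().__init__(convert_charrefs=False)
        self.rewrite = rewrite
        self.out = []

    def _start(self, tag, attrs, self_closing):
        url_attrs = URL_ATTRS.get(tag)
        if not url_attrs:
            # Not a tag we care about: copy the original text verbatim.
            self.out.append(self.get_starttag_text())
            return
        parts = [tag]
        for name, value in attrs:
            if value is None:            # valueless attribute, e.g. "defer"
                parts.append(name)
                continue
            if name in url_attrs:
                value = self.rewrite(value)
            parts.append('%s="%s"' % (name, escape(value, quote=True)))
        self.out.append("<" + " ".join(parts) + (" />" if self_closing else ">"))

    def handle_starttag(self, tag, attrs):
        self._start(tag, attrs, False)

    def handle_startendtag(self, tag, attrs):
        self._start(tag, attrs, True)

    def handle_endtag(self, tag):
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.out.append("<!%s>" % decl)

    def handle_pi(self, data):
        self.out.append("<?%s>" % data)


def rewrite_urls(html_text, rewrite):
    parser = URLRewriter(rewrite)
    parser.feed(html_text)
    parser.close()                       # flush anything still buffered
    return "".join(parser.out)


if __name__ == "__main__":
    # Sloppy input: unquoted attribute value, unclosed <b>.
    page = '<p>See <a href="/docs">docs</a> <img src=logo.png> <b>unclosed'
    print(rewrite_urls(page, lambda u: urljoin("http://example.com/base/", u)))
```

Tags listed in URL_ATTRS are rebuilt with lowercased names and double-quoted values, so their output is normalised rather than byte-for-byte identical to the input; all other start tags are copied as written. A library such as Beautiful Soup would hide most of this bookkeeping behind a tree API.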