[XML-SIG] ElementTree Tidy HTML Tree Builder and comments

Björn Lindström bkhl at elektrubadur.se
Sat Mar 19 04:41:23 CET 2005


I'm considering using the ElementTree Tidy HTML Tree Builder for a web
spidering program I'm developing.

However, my program must be able to extract certain information from
HTML comments.

I'm basically creating my trees like this:

TidyHTMLTreeBuilder.parse(urllib.urlopen(url))

Is it possible to make TidyHTMLTreeBuilder preserve comments when
parsing this way, and if so, how would I go about it?
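
To make the goal concrete, here is a minimal sketch of the kind of
result I am after. It does not use TidyHTMLTreeBuilder at all: it
assumes well-formed input and the standard library's
xml.etree.ElementTree in Python 3.8 or later, where TreeBuilder accepts
insert_comments=True. SAMPLE, parse_with_comments and comment_texts are
hypothetical names used only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, already well-formed sample; a real page would need tidying first.
SAMPLE = "<html><body><!-- id: 42 --><p>Hello</p></body></html>"


def parse_with_comments(text):
    """Parse well-formed markup and keep comment nodes in the tree."""
    # Assumption: Python 3.8+, where TreeBuilder supports insert_comments.
    builder = ET.TreeBuilder(insert_comments=True)
    parser = ET.XMLParser(target=builder)
    parser.feed(text)
    return parser.close()  # returns the root element


def comment_texts(root):
    """Return the stripped text of every comment, in document order."""
    # Comment nodes carry the ET.Comment factory function as their tag.
    return [node.text.strip() for node in root.iter() if node.tag is ET.Comment]


root = parse_with_comments(SAMPLE)
print(comment_texts(root))
```

What I would like is the equivalent behaviour from the tree that
TidyHTMLTreeBuilder.parse() returns, so that the comment nodes can be
walked alongside the ordinary elements.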


