[XML-SIG] xml / html parsing for webbot

Alexandre Fayolle Alexandre.Fayolle@logilab.fr
Sun, 10 Dec 2000 14:21:37 +0100 (CET)


> For that purpose, the DOM authors made special support for HTML. You
> normally need a special parser, one that is capable of processing
> HTML, and still building a DOM tree. PyXML now includes 4DOM, which, I
> believe, is capable of converting arbitrary HTML into a DOM tree.

Logilab contributed a much improved version of FromHtml to 4DOM a while
ago. I think it was included in 4Suite 0.9.2, but I don't know which version
ships with PyXML 0.6.2. If you need this code and can't find it in your
distribution, just ask.

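To give an idea of what "converting arbitrary HTML into a DOM tree" involves,
here is a minimal sketch. It is not the FromHtml code from 4DOM, and it does
not use the 4DOM API. It assumes a current Python and uses only the standard
library (html.parser and xml.dom.minidom). The names HtmlToDom, html_to_dom
and VOID_TAGS are made up for this illustration, and the recovery rules are
deliberately crude: no implicit closing of <p> or <li>, and only a short list
of empty elements.

```python
from html.parser import HTMLParser
from xml.dom.minidom import Document

# Assumption: a short, incomplete list of HTML elements that never take a
# closing tag. A real converter would need the full list.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class HtmlToDom(HTMLParser):
    """Hypothetical helper: build a minidom tree from lenient HTML."""

    def __init__(self):
        super().__init__()
        self.document = Document()
        self.stack = [self.document]  # open nodes, innermost last

    def handle_starttag(self, tag, attrs):
        element = self.document.createElement(tag)
        for name, value in attrs:
            # Valueless attributes (e.g. "checked") arrive as None.
            element.setAttribute(name, value or "")
        parent = self.stack[-1]
        # A DOM document allows only one root element, so any extra
        # top-level element is attached under the existing root.
        if parent is self.document and self.document.documentElement is not None:
            parent = self.document.documentElement
        parent.appendChild(element)
        if tag not in VOID_TAGS:
            self.stack.append(element)

    def handle_endtag(self, tag):
        # Close back to the nearest matching open element. This also closes
        # any unclosed children. Stray end tags are ignored.
        for index in range(len(self.stack) - 1, 0, -1):
            if self.stack[index].tagName == tag:
                del self.stack[index:]
                break

    def handle_data(self, data):
        # Skip whitespace-only runs and text outside any element.
        if len(self.stack) > 1 and data.strip():
            self.stack[-1].appendChild(self.document.createTextNode(data))


def html_to_dom(text):
    """Parse an HTML string and return a minidom Document."""
    parser = HtmlToDom()
    parser.feed(text)
    parser.close()
    return parser.document
```

The result is an ordinary DOM document, so the usual calls such as
getElementsByTagName("a") and getAttribute("href") work on it, which is
the part a webbot typically cares about.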

Alexandre Fayolle
-- 
http://www.logilab.com 
Narval is the first software agent available as free software (GPL).
LOGILAB, Paris (France).