[XML-SIG] Ideas for web/ package

Nicolas Chauvat Nicolas.Chauvat@logilab.fr
Sat, 16 Feb 2002 12:21:06 +0100 (CET)


> > Can anyone think of other things that could be part of this package?
> 
> The usual bunch of HTML tweaking functions, e.g. fast escaping,
> unescaping, finding certain parts within the page (in a non-parsing
> way, since this often breaks with today's HTML hackery),
> link checker, link finder, etc.

At some point Logilab contributed an HTML{Reader,Builder}? to PyXML/4Suite
that would parse an HTML document and build a DOM tree, inferring the
missing tags when needed (it knew the DTD). You could then use XSLT to
extract the kind of information listed above, and XSLT is a very good fit
for that task. I am not sure what eventually happened to that piece of
code, though: we have not needed it for some time, and I couldn't find
it in my current _xmlplus directory... I guess I could dig it out of our
CVS if someone is interested.
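
To give a rough idea of the approach, here is a minimal sketch in Python.
It is not the Logilab code: it assumes the standard html.parser and
xml.etree.ElementTree modules, replaces the DTD with two small hand-written
tag sets, and uses an ElementTree path query where the real thing would use
an XSLT stylesheet. The names TreeSketch, VOID, CLOSED_BY_SAME and
find_links are made up for the example.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical stand-ins for what a DTD would tell a real builder.
VOID = {"br", "hr", "img", "input", "link", "meta"}   # never have content
CLOSED_BY_SAME = {"p", "li", "td", "tr", "option"}    # end tag may be omitted


class TreeSketch(HTMLParser):
    """Build an element tree from HTML, inferring a few missing end tags."""

    def __init__(self):
        super().__init__()
        self.root = ET.Element("document")
        self.stack = [self.root]  # currently open elements, innermost last

    def handle_starttag(self, tag, attrs):
        # Infer a missing end tag: <li> right inside an open <li> closes it.
        if tag in CLOSED_BY_SAME and self.stack[-1].tag == tag:
            self.stack.pop()
        node = ET.SubElement(self.stack[-1], tag,
                             {name: value or "" for name, value in attrs})
        if tag not in VOID:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Close up to the nearest matching open element; ignore stray end tags.
        for depth in range(len(self.stack) - 1, 0, -1):
            if self.stack[depth].tag == tag:
                del self.stack[depth:]
                return

    def handle_data(self, data):
        node = self.stack[-1]
        if len(node):
            # Text after a closed child belongs to that child's tail.
            last = node[-1]
            last.tail = (last.tail or "") + data
        else:
            node.text = (node.text or "") + data


def find_links(html):
    """Return the href of every <a> element, in document order."""
    parser = TreeSketch()
    parser.feed(html)
    parser.close()
    # A path query stands in here for the XSLT extraction step.
    return [a.get("href") for a in parser.root.findall(".//a[@href]")]
```

For instance, find_links('<ul><li><a href="a.html">A</a><li><a
href="b.html">B</a></ul>') returns ['a.html', 'b.html'], with both <li>
elements ending up as siblings under <ul> even though neither is closed
in the input.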

-- 
Nicolas Chauvat

http://www.logilab.com - "Mais où est donc Ornicar ?" - LOGILAB, Paris (France)