[XML-SIG] SAX and HTML

Dierk Höppner D.Hoeppner@tu-bs.de
Tue, 3 Aug 1999 14:22:04 +0200


Hello,

I want to use SAX to extract data from HTML. I began by 
modifying the example saxstats.py, but did not get very far 
because my HTML sources are not well-formed XML documents. 
Then I forced the parser to use drv_htmllib, but this failed because 
htmllib's HTMLParser requires a formatter. drv_htmllib passes None, 
which of course doesn't work.

Any hints on what to do? Even an RTFM is welcome, but please point 
me to a good page ;-)
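
To make the goal concrete, here is a minimal sketch of the kind of bridge 
I am after. It is not the drv_htmllib driver and makes several assumptions: 
it uses the tolerant HTMLParser from a current Python 3 standard library 
(html.parser) in place of htmllib, and the names HTMLToSAX, TagCounter and 
count_tags are made up for illustration. It forwards HTML parse events to 
an ordinary SAX ContentHandler, so no formatter is involved.

```python
from html.parser import HTMLParser
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import AttributesImpl


class HTMLToSAX(HTMLParser):
    """Hypothetical bridge: forwards HTML parse events to a SAX handler."""

    def __init__(self, handler):
        super().__init__(convert_charrefs=True)
        self.handler = handler

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for
        # attributes written without a value, so map those to "".
        sax_attrs = AttributesImpl({name: value or "" for name, value in attrs})
        self.handler.startElement(tag, sax_attrs)

    def handle_endtag(self, tag):
        self.handler.endElement(tag)

    def handle_data(self, data):
        self.handler.characters(data)


class TagCounter(ContentHandler):
    """Example handler in the spirit of saxstats.py: counts start tags."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        self.counts[name] = self.counts.get(name, 0) + 1


def count_tags(html_text):
    """Feed HTML text through the bridge and return the tag counts."""
    handler = TagCounter()
    handler.startDocument()
    parser = HTMLToSAX(handler)
    parser.feed(html_text)
    parser.close()
    handler.endDocument()
    return handler.counts
```

Because the HTML parser does not insist on well-formed input, unclosed 
tags such as a bare <p> or <br> still produce startElement calls; the 
handler simply may not see a matching endElement for them.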

greetings

Dierk Hoeppner

Braunschweig University Library
Pockelsstr. 13
D-38106 Braunschweig
Germany
Tel: +49-531-391-5066 Fax: -5836
E-Mail: d.hoeppner@tu-bs.de