[XML-SIG] SAX and HTML
Fred L. Drake
"Fred L. Drake, Jr." <fdrake@acm.org>
Tue, 3 Aug 1999 09:41:22 -0400 (EDT)
Dierk Höppner writes:
> I want to use SAX to extract data from HTML. I began by
> modifying the example saxstats.py but did not get very far
> because my HTML sources are not well-formed XML documents.
> Then I forced the parser to use drv_htmllib, but this failed because
> htmllib's HTMLParser wants a formatter. drv_htmllib passes None,
> which of course doesn't work.
Dierk,
Try changing drv_htmllib to pass a formatter.NullFormatter instance
instead of None.  Let us know how that works; if a simple fix to
drv_htmllib does the trick, I think we can make that change!
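
To illustrate why None fails and a do-nothing formatter works, here is a
rough sketch. It is not the drv_htmllib code: SketchParser and
SketchNullFormatter are hypothetical stand-ins that only mimic how
htmllib.HTMLParser hands text to its formatter, so the example runs
without htmllib. In the driver itself, the change would presumably amount
to constructing the parser with formatter.NullFormatter() in place of None.

```python
from html.parser import HTMLParser


class SketchNullFormatter:
    """Hypothetical stand-in for formatter.NullFormatter: accepts calls, does nothing."""

    def add_flowing_data(self, data):
        pass


class SketchParser(HTMLParser):
    """Hypothetical stand-in for htmllib.HTMLParser: forwards text to a formatter."""

    def __init__(self, formatter):
        super().__init__()
        self.formatter = formatter
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # Record each start tag so the caller can inspect what was parsed.
        self.tags.append(tag)

    def handle_data(self, data):
        # Raises AttributeError when self.formatter is None.
        self.formatter.add_flowing_data(data)


def parse(html, formatter):
    """Parse html with the given formatter and return the start tags seen."""
    parser = SketchParser(formatter)
    parser.feed(html)
    parser.close()
    return parser.tags


SAMPLE = "<html><body><p>Hello<br>world</body></html>"

# With None, the first piece of text triggers an AttributeError.
try:
    parse(SAMPLE, None)
    failed_with_none = False
except AttributeError:
    failed_with_none = True

# With a null formatter, the text is silently discarded and parsing completes.
tags = parse(SAMPLE, SketchNullFormatter())
```

The null formatter is useful here because a SAX driver only cares about
the parse events, not about rendered output, so the formatter's job is
simply to absorb the calls.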
-Fred
--
Fred L. Drake, Jr. <fdrake@acm.org>
Corporation for National Research Initiatives