[XML-SIG] SAX and HTML

Fred L. Drake, Jr. <fdrake@acm.org>
Tue, 3 Aug 1999 09:41:22 -0400 (EDT)


Dierk Höppner writes:
 > I want to use SAX to extract data from HTML. I began by
 > modifying the example saxstats.py, but I did not get very far
 > because my HTML sources are not well-formed XML documents.
 > Then I forced the parser to use drv_htmllib, but this failed because
 > the HTMLParser in htmllib wants a formatter. drv_htmllib passes None,
 > which of course doesn't work.

Dierk,
  Try changing drv_htmllib to use a formatter.NullFormatter instance.
Let us know how that works; if a simple fix to drv_htmllib does the
trick, I think we can do that!
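
  A rough sketch of the idea, in case it helps. With the real modules
the change would presumably amount to constructing the parser as
htmllib.HTMLParser(formatter.NullFormatter()) instead of passing None.
Since htmllib and formatter may not be available in every Python
installation, the sketch below uses small stand-ins: NullFormatter,
SketchHTMLParser and feed_text are all hypothetical names written for
this illustration, not the actual drv_htmllib code.

```python
# Illustration only: these classes are stand-ins, not the real
# htmllib.HTMLParser or formatter.NullFormatter.

class NullFormatter:
    """Do-nothing formatter: accepts formatter calls and discards them."""

    def add_flowing_data(self, data):
        pass

    def end_paragraph(self, blankline):
        pass


class SketchHTMLParser:
    """Hypothetical parser that forwards character data to a formatter,
    in the way an HTML parser built around a formatter would."""

    def __init__(self, formatter):
        self.formatter = formatter
        self.seen = []  # text chunks a SAX driver could pass on as events

    def handle_data(self, data):
        self.seen.append(data)
        # This call fails with AttributeError when formatter is None.
        self.formatter.add_flowing_data(data)


def feed_text(formatter, text):
    """Run one chunk of text through the sketch parser and return
    the chunks it recorded."""
    parser = SketchHTMLParser(formatter)
    parser.handle_data(text)
    return parser.seen


if __name__ == "__main__":
    # Works: the null formatter swallows the output calls.
    print(feed_text(NullFormatter(), "some text"))

    # Fails: None has no formatter methods to call.
    try:
        feed_text(None, "some text")
    except AttributeError as exc:
        print("with None:", exc)
```

  The point is only that the parser always has an object to call, so
the driver can ignore the formatting output and keep the events.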


  -Fred

--
Fred L. Drake, Jr.	     <fdrake@acm.org>
Corporation for National Research Initiatives