[Python-Dev] htmllib vs. HTMLParser

amk at amk.ca
Tue Oct 28 07:53:50 EST 2003


On Mon, Oct 27, 2003 at 04:53:32PM -0800, Bill Janssen wrote:
> But IMO simply adding some handler methods won't really do it.  You
> also need to introduce some knowledge about the semantics of the
> syntax.  For example, a new "block"-level element should close all
> "in-line" elements that are currently open.  Etc.

Perhaps, but it might be a mug's game.  I was on the Lynx developer list for
a while, and bad HTML requires many, many hacks to be processed sensibly.
Given that XHTML use is slowly rising, that work may not be necessary, but
I'll keep it in mind.
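
For concreteness, here is a rough sketch of the rule Bill describes, where a new block-level start tag closes any in-line elements still open. It assumes the modern html.parser module (the Python 3 name for HTMLParser). The class name, the two tag sets, and the events list are all hypothetical and deliberately incomplete, so treat it as an illustration rather than a proposed patch.

```python
from html.parser import HTMLParser

# Illustrative, incomplete tag sets (assumption); real HTML defines many more.
BLOCK = {"p", "div", "ul", "ol", "li", "table", "blockquote", "h1", "h2", "h3"}
INLINE = {"a", "b", "i", "em", "strong", "span", "tt", "font"}


class InlineClosingParser(HTMLParser):
    """Hypothetical subclass that records a normalized event stream."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of elements currently open
        self.events = []     # ("start" | "end" | "data", value) tuples

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK:
            # Close the in-line elements sitting on top of the stack
            # before the new block-level element is opened.
            while self.open_tags and self.open_tags[-1] in INLINE:
                self.events.append(("end", self.open_tags.pop()))
        self.open_tags.append(tag)
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray end tag with no matching open element: ignore it
        # Close everything up to and including the matching element.
        while self.open_tags:
            closed = self.open_tags.pop()
            self.events.append(("end", closed))
            if closed == tag:
                break

    def handle_data(self, data):
        if data.strip():
            self.events.append(("data", data))
```

Feeding it "<p><b>bold <i>both<p>next</p>" emits end events for i and b before the second p starts. Even this toy only looks at the top of the stack and never closes an open p when another p begins, which is the sort of special case that piles up quickly with real-world pages.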

--amk


