[Web-SIG] Are both htmllib and HTMLParser needed?

Fred Drake fdrake at gmail.com
Wed Feb 20 16:43:26 CET 2008


On Feb 20, 2008 9:35 AM, Guido van Rossum <guido at python.org> wrote:
> ISTR that HTMLParser was the preferred one. It is certainly newer, and
> doesn't carry the baggage of sgmllib (which I would discard together
> with htmllib). Maybe Fred Drake remembers (he's listed as the
> co-author on the initial checkin message).

I was thinking I'd said something on the stdlib-sig list, but I can't
find it in the archive, so I must be having a senior moment (brought
on early by kids).

I'd be in favor of keeping only HTMLParser, with a compliant module
name ("htmlparser" doesn't seem unreasonable).  The code was
originally derived from htmllib for the Grail web browser, mostly to
make things like attribute handling less painful.
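
To illustrate the attribute handling mentioned above, here is a minimal
sketch. It assumes the module under its later Python 3 name, html.parser,
which postdates this message; LinkCollector and collect_links are
hypothetical names made up for the example, not part of any library.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical example class: gathers href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # The parser passes the tag name lowercased and the attributes as
        # a list of (name, value) pairs, with names lowercased and the
        # surrounding quotes already removed from the values.
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.links.append(href)


def collect_links(markup):
    """Hypothetical helper: feed markup to the parser, return the hrefs."""
    parser = LinkCollector()
    parser.feed(markup)
    parser.close()
    return parser.links
```

Something like collect_links('<a HREF="x.html">x</a>') should hand back
the href without the caller having to normalize case or strip quotes.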

Merging _markupbase into HTMLParser to create htmlparser would be
pretty mechanical.  Removing sgmllib and htmllib does not depend on
that, and can be done at any time if there's agreement.


  -Fred

-- 
Fred L. Drake, Jr.    <fdrake at gmail.com>
"Chaos is the score upon which reality is written." --Henry Miller

