Help with Parsing HTML files

Hernan M. Foffani hfoffani at yahoo.com
Thu Aug 2 16:00:05 EDT 2001


Charlie Clark <charlie at begeistert.org> wrote:
> .......
> What's the best way to go about parsing the HTML? I've looked at sgmllib
> and htmllib and am a bit lost. The worst thing for me about Python's
> documentation is its lack of examples. I leafed through all the Python
> books in the bookshop today but failed to find much inspiration. One of
> the problems I'll admit to having is not being able to work out how to
> use a class simply by reading its code - it just doesn't work for me
> :-((
> .........

Did you read the test sample that's included in sgmllib.py itself?
I mean the "class TestSGMLParser(SGMLParser):" part.
Some other samples that come with the Python distribution:
  .../Tools/webchecker/webchecker.py
  .../test/test_htmlparser.py
  .../test/test_sgmllib.py
Or pick up http://www.orgmf.com.ar/condor/pythlp.py  :-P
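
The general pattern with these parsers is: subclass the parser, override
the handler methods for the tags you care about, then feed() it the text.
Here is a minimal sketch of that idea, under some assumptions: it uses
html.parser.HTMLParser from Python 3 (sgmllib was removed in Python 3),
and LinkCollector and the sample markup are made-up names for
illustration, not taken from any of the files listed above.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical example class: collect the href of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # The parser calls this for every opening tag it meets.
        # tag arrives lowercased; attrs is a list of (name, value) pairs.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


# Made-up sample input, just to exercise the class.
parser = LinkCollector()
parser.feed('<p>See <a href="http://www.python.org/">Python</a> and '
            '<a href="/doc/">the docs</a>.</p>')
parser.close()
print(parser.links)  # ['http://www.python.org/', '/doc/']
```

To pick up other tags, add more branches in handle_starttag, or override
handle_endtag and handle_data in the same way.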

But better yet, go back to the bookstore and buy Fredrik Lundh's
Python Standard Library book. You won't regret it.
(For this particular problem, see pages 132-142.)

Regards,
-Hernan



More information about the Python-list mailing list