HTML Parser for Jython

Tim Chase python.list at tim.thechases.com
Wed Oct 17 11:58:32 EDT 2007


> Does anybody know of a decent HTML parser for Jython? I have to do
> some screen scraping, and would rather use a tested module instead of
> rolling my own.

GIYF[0][1]
The first three results are the batteries-included HTMLParser[2]
and htmllib[3] modules, and the ever-popular (and more
developer-friendly) BeautifulSoup[4] library.  For running BS in
Jython, it's recommended[5] to use an older (1.x?) version, which
is available at the BS site[6].
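
For the scraping itself, a minimal sketch using the batteries-included
HTMLParser[2] might look something like the following.  LinkCollector
and extract_links are made-up names for illustration, and the import
fallback is an assumption added so the same code also runs where the
module lives at html.parser; it has not been checked against any
particular Jython release.

```python
# Sketch: collect link targets with the standard-library parser.
try:
    from HTMLParser import HTMLParser      # Python 2 era module name, as in [2]
except ImportError:
    from html.parser import HTMLParser     # same class under its Python 3 name


class LinkCollector(HTMLParser):
    """Hypothetical helper: gathers the href of every <a> tag."""

    def __init__(self):
        # Explicit base-class call; works whether or not the base
        # is a new-style class.
        HTMLParser.__init__(self)
        self.links = []

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag and attribute names and passes
        # attrs as a list of (name, value) pairs.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html):
    """Return the href values found in the given HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links
```

Calling extract_links() on the text of a fetched page gives a plain
list of href strings.  Real-world pages are often malformed enough
that BeautifulSoup[4] ends up being the less painful route.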

-tkc

[0]http://en.wikipedia.org/wiki/GIYF#G
[1]http://www.google.com/search?q=python%20html%20parser
[2]http://docs.python.org/lib/module-HTMLParser.html
[3]http://docs.python.org/lib/module-htmllib.html
[4]http://www.crummy.com/software/BeautifulSoup/
[5]http://mail.python.org/pipermail/python-list/2007-May/439618.html
[6]http://www.crummy.com/software/BeautifulSoup/download/1.x/










