HTML Parsing and Indexing
Stefan Behnel
stefan.behnel-n05pAM at web.de
Tue Nov 14 02:50:57 EST 2006
mailtogops at gmail.com wrote:
> I am involved in a project that collects news published on selected,
> known web sites in formats such as HTML and RSS, shortlists the items,
> and creates bookmarks to the news content on our website (we will use
> Django for web development). The project is currently under heavy
> development.
>
> I need help with an HTML parser.
lxml includes an HTML parser which can parse straight from URLs.
http://codespeak.net/lxml/
http://cheeseshop.python.org/pypi/lxml
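
A minimal sketch of what that could look like for collecting links from a news page, assuming lxml is installed. The function name extract_links and the SAMPLE markup are made up for illustration; lxml.html.parse() accepts a filename, a file-like object, or a URL, so the same function can be pointed at a page address. Whether URL loading works (and for which schemes) depends on the underlying libxml2 build, so fetching the page yourself and passing the bytes in is a reasonable fallback.

```python
from io import BytesIO

from lxml import html  # third-party package: lxml


def extract_links(source):
    """Return (link text, href) pairs for every <a> element that has an href.

    `source` is handed straight to lxml.html.parse(), so it may be a
    filename, a file-like object, or a URL string.
    """
    root = html.parse(source).getroot()
    links = []
    for anchor in root.iter("a"):
        href = anchor.get("href")
        if href:  # skip named anchors without a target
            links.append((anchor.text_content().strip(), href))
    return links


# Hypothetical page content, used here so the example runs without a network.
SAMPLE = b"""<html><body>
<h1>News</h1>
<a href="/story/1">First story</a>
<a name="top">no href</a>
<p><a href="http://example.com/story/2">Second <b>story</b></a>
</body></html>"""

if __name__ == "__main__":
    for text, href in extract_links(BytesIO(SAMPLE)):
        print(text, href)
```

The parser is tolerant of sloppy markup (note the unclosed <p> in the sample), which is usually what you want for pages collected from arbitrary news sites.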
Stefan