Any equivalent to Ruby's 'hpricot' html/xpath/css selector package?

Stefan Behnel stefan_ml at behnel.de
Tue Dec 30 08:28:37 EST 2008


Kenneth McDonald wrote:
> Ruby has a package called 'hpricot' which can perform limited xpath
> queries, and CSS selector queries. However, what makes it really useful
> is that it does a good job of handling the "broken" html that is so
> commonly found on the web. Does Python have anything similar, i.e.
> something that will not only do XPath queries, but will do so on
> imperfect HTML?

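lxml.html is your friend. It parses the broken HTML found on the web and
lets you run both XPath and CSS selector queries on the result.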

http://codespeak.net/lxml/lxmlhtml.html
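
A minimal sketch of what that looks like in practice, assuming lxml is
installed. The markup and the variable names (broken, root, paragraphs,
links) are made up for illustration; only lxml.html.fromstring(), xpath()
and text_content() come from lxml itself.

```python
import lxml.html

# Deliberately sloppy, made-up markup: neither <p> is closed, <b> is
# never closed, and the second href value is unquoted.
broken = (
    '<div><p>First paragraph'
    '<p>Second <b>paragraph'
    '<a href="/a">one</a><a href=/b>two</div>'
)

# fromstring() uses a forgiving HTML parser that repairs the tree
# instead of raising an error on malformed input.
root = lxml.html.fromstring(broken)

# Ordinary XPath queries then work on the repaired tree.
paragraphs = root.xpath("//p")
links = root.xpath("//a/@href")

for p in paragraphs:
    print(p.text_content())
print(links)
```

The CSS selector counterpart is root.cssselect("a"). Note that recent lxml
releases rely on the separately installed cssselect package for that
method, so the sketch above sticks to XPath.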

Stefan


