[BangPypers] HTML Parsing in python

S.Ramaswamy srsy70 at gmail.com
Thu Sep 10 15:35:51 CEST 2009


On Thu, Sep 10, 2009 at 2:29 PM, Puneet Aggarwal <look4puneet at gmail.com> wrote:

> Hi BangPypers,
>
> Can anyone suggest a good library for HTML parsing in Python?
> I googled and found a few libraries: BeautifulSoup, HTMLParser, SGMLParser,
> etc.
>
> Which one should I go for, based on your experience?
>
Recent versions of BeautifulSoup are awfully slow. I had to switch from
3.1.0 back to the older 3.0.7a for an app that I wrote recently. The author
explains the issues here:
http://www.crummy.com/software/BeautifulSoup/3.1-problems.html . I'm
sticking with it since I am used to it, but if you are starting fresh, it
makes sense to explore other libs.
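
If you want something to compare against, here is a minimal sketch using the
standard library's HTMLParser, one of the libraries you listed. I'm assuming
Python 3 here, where the module is html.parser (on Python 2 it is the
HTMLParser module). LinkCollector and extract_links are names I made up for
illustration, and I'm not making any claim about how its speed compares.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags (hypothetical helper for illustration)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased; attrs is a list of
        # (name, value) pairs, and value is None for valueless attributes.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html):
    """Return every href found on an <a> tag, in document order."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links
```

You feed it a string of markup and read the collected list back, e.g.
extract_links(page_source). It is event based, so you don't get a tree to
navigate the way you do with BeautifulSoup.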

Ramaswamy