Parsing HTML

Ben Last ben at benlast.com
Thu Sep 23 03:11:15 EDT 2004


There are several HTML parsers, but many people (including me) speak well of
Beautiful Soup. It doesn't try to check for correctness or do any
validation; it just parses the HTML.
http://www.crummy.com/software/BeautifulSoup/
ben
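
To make the task in the quoted question concrete, here is a minimal sketch of
turning a two-column word list into a lookup dictionary. It uses the standard
library's html.parser in modern Python 3 rather than Beautiful Soup, so that it
runs with no extra install. The page layout is an assumption: the English term
is taken to be in the first table cell of a row and the translation in the
second. The names WordTableParser, build_dictionary and SAMPLE are
hypothetical, and SAMPLE is made-up markup, not the real Microsoft page.

```python
from html.parser import HTMLParser


class WordTableParser(HTMLParser):
    """Collect the text of each table row as a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.rows = []      # completed rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None outside a row
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def build_dictionary(html):
    """Map first-column text to second-column text (assumed page layout)."""
    parser = WordTableParser()
    parser.feed(html)
    parser.close()
    return {row[0]: row[1] for row in parser.rows if len(row) >= 2 and row[0]}


# Made-up sample markup standing in for the real word-list page.
SAMPLE = """
<table>
  <tr><td>About</td><td>Om</td></tr>
  <tr><td>Cancel</td><td>Avbryt</td></tr>
</table>
"""

words = build_dictionary(SAMPLE)
print(words["About"])  # prints: Om
```

A header row, if the real page has one, would end up as an ordinary entry and
would need to be dropped by hand. The same row-by-row approach should carry
over to Beautiful Soup, which is more forgiving of badly formed markup.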

> -----Original Message-----
> From: python-list-bounces+ben=benlast.com at python.org
> [mailto:python-list-bounces+ben=benlast.com at python.org]On Behalf Of
> Anders Eriksson
> Sent: 23 September 2004 07:42
> To: python-list at python.org
> Subject: Parsing HTML
>
>
> Hello!
>
> I want to extract some info from some specific HTML pages, Microsoft's
> International Word List (e.g.
> http://msdn.microsoft.com/library/en-us/dnwue/html/swe_word_list.htm). I
> want to take all the words, both English and the other language, and create
> a dictionary, so that I can look up About and get Om as the answer.
>
> What is the best way to do this?
>
> Please help!
>
> // Anders
> --
> http://mail.python.org/mailman/listinfo/python-list