url filtering

Dennis Benzinger Dennis.Benzinger at gmx.net
Sun Dec 17 14:24:16 EST 2006


On Sun, 17 Dec 2006 20:14:32 +0100,
vertigo <spam at spam.pl> wrote:

> Hello
> 
> I want to do some text analysis based on HTML documents grabbed from
> the internet.
> Is there any library that would let me easily extract the text from
> HTML documents
> (stripping JavaScript, HTML tags and other unnecessary data)?
> 
> Thanx

Try Beautiful Soup: http://www.crummy.com/software/BeautifulSoup/
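
A rough sketch of how that could look. It assumes the current
Beautiful Soup 4 release (imported as `bs4`, installed with
`pip install beautifulsoup4`), not the version that was on that page in
2006, and it falls back to the standard library's `html.parser` if `bs4`
is not installed. The names `extract_text`, `_TextOnly` and `sample` are
made up for this example.

```python
from html.parser import HTMLParser

try:
    # Assumption: Beautiful Soup 4 is installed (pip install beautifulsoup4).
    from bs4 import BeautifulSoup
except ImportError:
    BeautifulSoup = None


class _TextOnly(HTMLParser):
    """Stdlib fallback: collect text, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def extract_text(html):
    """Return the visible text of an HTML string with whitespace collapsed."""
    if BeautifulSoup is not None:
        soup = BeautifulSoup(html, "html.parser")
        # Drop JavaScript and CSS entirely before pulling out the text.
        for tag in soup(["script", "style"]):
            tag.decompose()
        raw = soup.get_text(separator=" ")
    else:
        parser = _TextOnly()
        parser.feed(html)
        parser.close()
        raw = " ".join(parser.parts)
    # Collapse runs of spaces and newlines into single spaces.
    return " ".join(raw.split())


sample = """<html><head><title>Demo</title>
<script>alert("hi");</script><style>p {color: red}</style></head>
<body><h1>Hello</h1><p>Some <b>bold</b> text.</p></body></html>"""

print(extract_text(sample))
```

For pages grabbed from the net you would pass the downloaded HTML string
to `extract_text` instead of `sample`. Real-world pages may need more
cleanup (navigation menus, comments, hidden elements), which this sketch
does not attempt.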


Dennis



More information about the Python-list mailing list