Extract Title from HTML documents

Anakim Border aborder at users.sourceforge.net
Thu Nov 4 16:39:47 EST 2004


You may find BeautifulSoup (http://www.crummy.com/software/BeautifulSoup/) 
useful.

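from BeautifulSoup import BeautifulSoup
# Parse the whole file (Python 2, old BeautifulSoup module).
b = BeautifulSoup()
b.feed(file('test.html').read())
# first('title') returns the first <title> tag; print its contents.
print b.first('title').renderContents()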
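
If you would rather not install anything, here is a rough sketch of the same idea using only the standard library's html.parser module. It assumes Python 3, and the names TitleParser and extract_title are made up for this example; they are not part of BeautifulSoup or of the standard library.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.in_title = False   # True while we are between <title> and </title>
        self.done = False       # True once the first title has been closed
        self.parts = []         # text chunks seen inside the title

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased, so <TITLE> is matched as well.
        if tag == 'title' and not self.done:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == 'title' and self.in_title:
            self.in_title = False
            self.done = True

    def handle_data(self, data):
        # Text may arrive in several chunks, so collect and join later.
        if self.in_title:
            self.parts.append(data)


def extract_title(html):
    """Return the stripped title text, or None if no title text is found."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    title = ''.join(parser.parts).strip()
    return title or None
```

Something like extract_title(open('test.html').read()) should then give you the title string, or None when the document has no title.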

HTH

-- 
 Anakim Border
 http://pydc.sourceforge.net
 aborder at users.sourceforge.net


