extracting title and/or summary of a website

alex23 wuwei23 at gmail.com
Wed May 21 22:41:17 EDT 2008


On May 22, 3:28 am, रवींदर ठाकुर (ravinder thakur)
<ravindertha... at gmail.com> wrote:
> is there any lib in python that provides a mechanism to get the title
> of a web page ? also is there anything available to get a nice summary
> like the way google shows below every link ?

It's not part of the standard lib but I really like using
BeautifulSoup for this kind of thing:

    # Python 2, with BeautifulSoup 3 installed
    from urllib import urlopen
    from BeautifulSoup import BeautifulSoup

    # fetch the page and parse it
    html = urlopen("http://www.google.com").read()
    soup = BeautifulSoup(html)

    print soup.title                   # the whole tag: <title>Google</title>
    print soup.title.renderContents()  # just the text: Google

http://www.crummy.com/software/BeautifulSoup/
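
For the summary half of the question, one rough option is to read the page's <meta name="description"> tag alongside the title. The sketch below is only that: it assumes the page actually supplies such a tag (many don't), and the snippets Google shows are not necessarily taken from it. It uses the standard library's html.parser on Python 3 rather than BeautifulSoup, purely so it runs without installing anything; TitleSummaryParser, title_and_summary and the sample markup are made-up names for illustration.

```python
from html.parser import HTMLParser


class TitleSummaryParser(HTMLParser):
    """Collects the <title> text and the <meta name="description"> content."""

    def __init__(self):
        super().__init__()
        self.title = None      # text of the first <title>, if any
        self.summary = None    # content of the first description meta tag, if any
        self._in_title = False
        self._title_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and self.title is None:
            self._in_title = True
        elif tag == "meta" and self.summary is None:
            attr = dict(attrs)
            # attribute values can be None, so guard before lower()/strip()
            if (attr.get("name") or "").lower() == "description":
                self.summary = (attr.get("content") or "").strip()

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            # join the pieces and collapse runs of whitespace
            self.title = " ".join("".join(self._title_parts).split())

    def handle_data(self, data):
        if self._in_title:
            self._title_parts.append(data)


def title_and_summary(html):
    """Return (title, summary); either may be None if the page lacks it."""
    parser = TitleSummaryParser()
    parser.feed(html)
    parser.close()
    return parser.title, parser.summary


# Made-up sample page, so the example runs without network access.
sample = """<html><head>
<title>Example Page</title>
<meta name="description" content="A short blurb about the page.">
</head><body><p>Hello</p></body></html>"""

print(title_and_summary(sample))
```

To try it on a live page, pass in the decoded result of urllib.request.urlopen(url).read() instead of the sample string. When a page has no description tag, the summary comes back as None and you would need some other fallback, such as the first paragraph of body text.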

- alex23



More information about the Python-list mailing list