Help extracting info from HTML source ..

Miki miki.tebeka at gmail.com
Fri Jan 26 08:45:18 EST 2007


Hello Shelton,

>   I am learning Python, and have never worked with HTML.  However, I would
> like to write a simple script to audit my 100+ Netware servers via their web
> portal.
Always use the right tool. BeautifulSoup
(http://www.crummy.com/software/BeautifulSoup/) is the best one for web
scraping (IMO).

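from urllib import urlopen
from BeautifulSoup import BeautifulSoup

# Download the page and parse it
html = urlopen("http://www.python.org").read()
soup = BeautifulSoup(html)
# soup("a") is shorthand for soup.findAll("a"); href=True skips anchors
# that have no href attribute (link["href"] would raise KeyError on those)
for link in soup("a", href=True):
    print link["href"], "-->", link.contents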
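
If installing BeautifulSoup is not an option, roughly the same link listing can be sketched with only the standard library. This is a different technique (the event-based html.parser module, Python 3), not the BeautifulSoup API, and the names LinkCollector and extract_links are made up for illustration:

```python
# Python 3, standard library only; a sketch, not the BeautifulSoup API.
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, text) pairs for every <a> tag that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:  # ignore anchors without an href
                self._href = href
                self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open <a href=...>
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return a list of (href, link text) tuples found in an HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    # Hypothetical sample markup, standing in for a downloaded page
    sample = '<p><a href="/status">Server <b>status</b></a> <a name="top">no href</a></p>'
    for href, text in extract_links(sample):
        print(href, "-->", text)
```

For real pages, the HTML string would come from the download step shown above. BeautifulSoup is still likely the more forgiving choice for messy markup.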

HTH,
--
Miki
http://pythonwise.blogspot.com/



