A try with WebScraping using Python

Raji Seetharaman sraji.me at gmail.com
Fri Dec 11 13:24:06 EST 2009


Hi

From a tutorial I found on the net, I came to know about web scraping using
Python.

I thought I would give it a try.

I want to extract the contact email IDs from all the posts published so far
at the link below:

http://fossjobs.wordpress.com/

With the Firebug add-on it is easy to find the location of the email IDs
inside the HTML DOM tree.
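
To make the extraction step concrete, here is a rough sketch of pulling
addresses out of a page once its HTML is in hand. It assumes the addresses
appear either in mailto: links or as plain text in the post body; how the
blog actually marks them up has not been checked. The names EmailCollector
and extract_emails are hypothetical, and the code uses only the Python 3
standard library.

```python
import re
from html.parser import HTMLParser

# Loose pattern for email-like strings; not a full RFC 5322 validator.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


class EmailCollector(HTMLParser):
    """Hypothetical helper: gathers addresses from mailto: links and text."""

    def __init__(self):
        super().__init__()
        self.emails = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.lower().startswith("mailto:"):
                # Drop the scheme and any "?subject=..." query part.
                address = value[len("mailto:"):].split("?", 1)[0]
                if EMAIL_RE.fullmatch(address):
                    self.emails.add(address)

    def handle_data(self, data):
        # Also pick up addresses written out as plain text.
        self.emails.update(EMAIL_RE.findall(data))


def extract_emails(html):
    """Return the sorted, de-duplicated addresses found in an HTML string."""
    parser = EmailCollector()
    parser.feed(html)
    parser.close()
    return sorted(parser.emails)
```

Calling extract_emails(page_html) on each downloaded page and merging the
results into one set would give the full list. Addresses that are obfuscated
(for example written as "name at example dot com") would need a separate
rule.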

I don't know how to download all the web pages, i.e., the coding part.

Which library can I use to download them? (mechanize or windmill)
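
As a rough sketch of the download step, the standard library may be enough
for plain static pages, without mechanize or windmill. The code below is
Python 3 and rests on assumptions that have not been verified for this blog:
that older posts are reachable at /page/2/, /page/3/ and so on, and that
requesting a page past the last one returns HTTP 404. The function names,
the User-Agent string, and the max_pages and delay values are illustrative
only.

```python
import time
import urllib.request
from urllib.error import HTTPError


def page_url(base_url, page_number):
    """Build the URL of an archive page (assumed /page/N/ scheme)."""
    base = base_url.rstrip("/")
    if page_number == 1:
        return base + "/"
    return "%s/page/%d/" % (base, page_number)


def fetch(url, timeout=30):
    """Download one page and return its body as text."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "Mozilla/5.0 (compatible; example-scraper)"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def download_all(base_url, max_pages=50, delay=1.0):
    """Fetch archive pages in order until a 404 or max_pages is reached."""
    pages = []
    for number in range(1, max_pages + 1):
        try:
            pages.append(fetch(page_url(base_url, number)))
        except HTTPError as err:
            if err.code == 404:
                break  # assumed to mean there are no more pages
            raise
        time.sleep(delay)  # pause between requests to be polite to the server
    return pages
```

For example, download_all("http://fossjobs.wordpress.com/") would return a
list of HTML strings, one per archive page, if the assumptions above hold.
mechanize would mainly be useful if forms or logins were involved, and
windmill if the pages needed a real browser to render.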

Any help would be appreciated.

Thanks

Raji. S
http://sraji.wordpress.com/