difference between urllib2.urlopen and firefox view 'page source'?

cjl cjlesh at gmail.com
Wed Mar 21 11:16:39 EDT 2007


Group:

Thank you for all the informative replies; they have helped me figure
things out. Next up is learning Beautiful Soup.

Thank you for the code example, but I am trying to learn how to
'screen scrape'. Yahoo does make historical stock data available in
CSV format, but they do not do this for stock options, which is what
I am ultimately attempting to scrape.

Here is what I have so far. I know how broken and ugly it is:

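import sys
import urllib2

from BeautifulSoup import BeautifulSoup

# Fetch the options page for the ticker symbol given on the command line.
url = "http://finance.yahoo.com/q/op?s=" + sys.argv[1]
page = urllib2.urlopen(url)
soup = BeautifulSoup(page)

# The price is the first child of <big><b> inside the "yfncsubtit" table.
print soup.find("table", {"id": "yfncsubtit"}).big.b.contents[0]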

This actually works, and will print out the current stock price for
whatever ticker symbol you supply as the command line argument when
you launch this script. Later I will add error checking, etc.
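
As a rough sketch of what that error checking might look like: find()
returns None when no matching table exists, and a missing child tag
such as .big also comes back as None, so the chained lookup fails with
an AttributeError on an unknown ticker or a changed page layout. The
helper below walks the same chain one step at a time. The name
price_text is hypothetical and not part of Beautiful Soup.

```python
def price_text(table):
    """Return the first child of table.big.b, or None if any step is missing.

    `table` is whatever soup.find(...) returned, which may be None.
    """
    node = table
    for name in ("big", "b"):
        # getattr with a default also copes with node being None.
        node = getattr(node, name, None)
        if node is None:
            return None
    contents = getattr(node, "contents", None)
    if not contents:
        # The tag exists but has no children to read a price from.
        return None
    return contents[0]
```

With the script above it would be called as
price_text(soup.find("table", {"id": "yfncsubtit"})), and a None result
could then be reported as an error instead of raising a traceback.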

Any advice on how I am using Beautiful Soup in the above code?

thanks again,
cjl



