Difference between urllib2.urlopen and Firefox 'View Page Source'?

cjl cjlesh at gmail.com
Mon Mar 19 22:30:38 EDT 2007


Hi.

I am trying to screen-scrape some stock data from Yahoo, so I am
using urllib2 to retrieve the HTML and Beautiful Soup to parse it.

Most likely I am doing something wrong, but when I fetch a page with
urllib2.urlopen and then view 'page source' for the exact same URL in
Firefox, I see slight differences in the raw HTML.
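One way to pin down exactly where two copies of a page differ is to
save both and diff them line by line. This is a minimal sketch using
the standard difflib module; the html_diff name and the sample markup
are made up for illustration and are not the actual Yahoo output.

```python
import difflib


def html_diff(fetched, browser, context=1):
    """Return unified-diff lines between two HTML strings."""
    return list(difflib.unified_diff(
        fetched.splitlines(),
        browser.splitlines(),
        fromfile="urllib2",   # label for the programmatically fetched copy
        tofile="firefox",     # label for the copy saved from the browser
        lineterm="",          # inputs have no trailing newlines
        n=context,            # lines of context around each change
    ))


# Made-up sample markup (assumption): one line differs between the copies.
fetched = "<html>\n<body>\n<span id='price'>10.00</span>\n</body>\n</html>"
browser = "<html>\n<body>\n<span id='price'>10.05</span>\n</body>\n</html>"

for line in html_diff(fetched, browser):
    print(line)
```

Lines starting with '-' appear only in the fetched copy and lines
starting with '+' only in the browser copy, which should show whether
the differences are in the data or just in incidental markup.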

Do I need to set a User-Agent header so that Yahoo thinks urllib2 is
Firefox? Or is Yahoo detecting that urllib2 doesn't process
JavaScript, and serving different data as a result?
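For reference, setting the User-Agent would look roughly like the
sketch below. It assumes Python 3, where urllib2 became
urllib.request; in Python 2 the equivalent is urllib2.Request with the
same headers argument. The User-Agent string and the build_request and
fetch names are illustrative assumptions, not values Yahoo is known to
require.

```python
from urllib.request import Request, urlopen

# Hypothetical browser-like User-Agent string (assumption, for illustration).
USER_AGENT = (
    "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) "
    "Gecko/20100101 Firefox/115.0"
)


def build_request(url, user_agent=USER_AGENT):
    """Build a Request that sends a custom User-Agent header."""
    return Request(url, headers={"User-Agent": user_agent})


def fetch(url, user_agent=USER_AGENT):
    """Fetch the raw bytes of a page using the custom User-Agent."""
    with urlopen(build_request(url, user_agent), timeout=10) as response:
        return response.read()
```

Calling fetch(url) returns the raw bytes, which can be passed to the
parser. This changes only the header the server sees; it does not run
any JavaScript, so content that the browser builds with scripts after
loading would still be missing.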

-cjl




More information about the Python-list mailing list