difference between urllib2.urlopen and firefox view 'page source'?

Steve Holden steve at holdenweb.com
Mon Mar 19 23:25:28 EDT 2007


cjl wrote:
> Hi.
> 
> I am trying to screen scrape some stock data from Yahoo, so I am
> trying to use urllib2 to retrieve the HTML and Beautiful Soup for the
> parsing.
> 
> Maybe (most likely) I am doing something wrong, but when I use
> urllib2.urlopen to fetch a page, and when I view 'page source' of the
> exact same URL in Firefox, I am seeing slight differences in the raw
> HTML.
> 
> Do I need to set a browser agent so Yahoo thinks urllib2 is Firefox?
> Is Yahoo detecting that urllib2 doesn't process JavaScript, and
> passing different data?
> 
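It's almost certainly a browser detection issue: the server looks at the
User-Agent header your client sends and varies the HTML accordingly. This
may not matter for your application, provided the data you want to parse
appears in both versions of the page.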
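
If you want to test that theory, something along these lines should do it.
Treat it as a minimal sketch rather than a recipe: the User-Agent string is
just an illustrative Firefox 2 value, build_request and fetch are names made
up for this example, and the import fallback is only there so the same code
also runs where urllib2 has become urllib.request.

```python
try:
    import urllib2 as urlrequest          # Python 2
except ImportError:
    import urllib.request as urlrequest   # Python 3: same Request/urlopen API

# Illustrative Firefox 2 User-Agent string (an assumption for this example);
# any real browser's string would serve the same purpose.
FIREFOX_UA = ("Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.2) "
              "Gecko/20070219 Firefox/2.0.0.2")


def build_request(url, user_agent=FIREFOX_UA):
    """Return a Request that announces itself with the given User-Agent."""
    return urlrequest.Request(url, headers={"User-Agent": user_agent})


def fetch(url, user_agent=FIREFOX_UA):
    """Fetch url with the given User-Agent and return the raw response body."""
    response = urlrequest.urlopen(build_request(url, user_agent))
    try:
        return response.read()
    finally:
        response.close()
```

Without the header, urllib2 identifies itself as Python-urllib plus a
version number, which is easy for a server to tell apart from a browser.
Calling fetch() on the URL you are scraping and comparing the result with
what Firefox shows should tell you whether the User-Agent accounts for the
differences.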

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC/Ltd          http://www.holdenweb.com
Skype: holdenweb     http://del.icio.us/steve.holden
Recent Ramblings       http://holdenweb.blogspot.com




More information about the Python-list mailing list