Web page data and urllib2.urlopen

ryles rylesny at gmail.com
Thu Aug 6 12:20:28 EDT 2009


On Aug 5, 4:30 pm, Massi <massi_... at msn.com> wrote:
> Hi everyone, I'm using the urllib2 library to get the HTML source code
> of web pages. In general it works great, but I'm dealing with a
> financial web site that does not return the source code I expect. As
> a matter of fact, if you try:
>
> import urllib2
> res = urllib2.urlopen("http://www.marketwatch.com/story/mondays-biggest-gaining-and-declining-stocks-2009-07-27")
> page = res.read()
> print page
>
> you will see that the printed code is very different from the one
> shown, for example, by Mozilla. Since I have very little knowledge
> of HTML, I can't even tell whether this is a Python or an HTML problem.
> Can anyone give me some help?
> Thanks in advance.

Check if setting your user agent to Mozilla results in a different
page:

http://diveintopython.org/http_web_services/user_agent.html
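
By default urllib2 identifies itself as "Python-urllib/x.y", and some sites
serve different content depending on that header. Below is a minimal sketch
of the check suggested above, assuming the difference really is caused by the
User-Agent (it may not be). The USER_AGENT string and the helper names
build_request and fetch are illustrative, not part of urllib2; the import
falls back to urllib.request so the same code also runs on Python 3.

```python
try:
    import urllib2 as urlrequest          # Python 2, as in the original post
except ImportError:
    import urllib.request as urlrequest   # Python 3 equivalent module

URL = ("http://www.marketwatch.com/story/"
       "mondays-biggest-gaining-and-declining-stocks-2009-07-27")

# Illustrative browser-like value (an assumption); any string can be used here.
USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0"


def build_request(url, user_agent=USER_AGENT):
    """Return a Request that sends user_agent instead of the library default."""
    req = urlrequest.Request(url)
    req.add_header("User-Agent", user_agent)
    return req


def fetch(url, user_agent=USER_AGENT):
    """Fetch url and return the raw response body."""
    res = urlrequest.urlopen(build_request(url, user_agent))
    try:
        return res.read()
    finally:
        res.close()
```

Calling fetch(URL) and comparing the result with what plain
urllib2.urlopen(URL).read() returns should show whether the header matters
for this site. If the two pages are the same, the difference seen in the
browser probably has another cause, such as content generated by JavaScript.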


