html source

Kartic removethis.kartic.krishnamurthy at gmail.com
Tue Feb 14 20:56:29 EST 2006


Hi Steve (Young),

Here is my take. It is possible that the site you are accessing 
generates the page dynamically based on the User-Agent header.

When you fetch a page through urllib2, the User-Agent is set to 
Python-urllib/x.x. If the page is generated dynamically, that value 
would fall into the "else" branch (of the page-generation logic) and 
yield a page without all the fancy scripts.
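
To see what urllib2 sends by default, you can inspect the opener before 
making any request. This is only a quick sketch; the try/except import is 
my assumption so that it also runs on newer Pythons, where the same module 
lives at urllib.request.

```python
try:
    import urllib2  # Python 2
except ImportError:
    import urllib.request as urllib2  # newer Pythons: same API, new name

opener = urllib2.build_opener()

# addheaders is a list of (name, value) pairs the opener sends with
# every request unless the request itself overrides them.
default_headers = dict(opener.addheaders)
print(default_headers["User-agent"])
```

The printed value starts with Python-urllib/ followed by a version number, 
which is what the server sees instead of a browser name.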

Why don't you try setting the User-Agent to the same value your browser 
sends and see whether you get the same HTML source this time?

Refer to http://diveintopython.org/http_web_services/user_agent.html for 
details on setting the User-Agent.
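
Roughly, it could look like the sketch below. The URL and the browser 
string are placeholders I made up for illustration (substitute the page 
you are fetching and whatever your own browser reports), and make_request 
and fetch are just hypothetical helper names.

```python
try:
    import urllib2  # Python 2
except ImportError:
    import urllib.request as urllib2  # newer Pythons: same API, new name

# Placeholder values (assumptions): use your own URL and browser string.
URL = "http://www.example.com/"
BROWSER_UA = ("Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8) "
              "Gecko/20051111 Firefox/1.5")


def make_request(url, user_agent=BROWSER_UA):
    """Build a Request whose User-Agent replaces the Python-urllib default."""
    return urllib2.Request(url, headers={"User-Agent": user_agent})


def fetch(url, user_agent=BROWSER_UA):
    """Fetch url while identifying as the given browser; return the raw body."""
    opener = urllib2.build_opener()
    response = opener.open(make_request(url, user_agent))
    try:
        return response.read()
    finally:
        response.close()

# Usage (performs a real network request, so it is left commented out):
# html = fetch(URL)
```

If the output of fetch() now matches view-->source, the server is varying 
its response by User-Agent. If it still differs, the missing parts may be 
added by scripts running in the browser, which urllib2 does not execute.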

Thanks,
--Kartic

The Great 'Steve Holden' uttered these words on 2/13/2006 4:53 PM:
> Steve Young wrote:
> 
>> Hi, I was wondering why when I use urllib2.build_opener().open(url), 
>> it doesn't give me the same thing as if I would just click on view--> 
>> source on my web browser. It gives me most of html on the page but 
>> leaves out lots of scripts and some of the link's urls are truncated. 
>> Is there something out there in python that gives me EXACTLY the 
>> same thing as if you were to just do view-->source on the web browser? 
>> Thanks for the help.
>>
> If this observation is truly correct it should qualify as a bug. Can you 
> give us a URL and some code which demonstrate your assertion?
> 
> regards
>  Steve


