Using Python 2.1 to download asp www pages

Sun Jan 6 23:14:39 EST 2002

"Zugz" <zugz.public at DEL-ete-MEbtinternet.com> writes:
> However if I access one of the pages of interest, which all have the same
> form as below but with a varying last page number:
> 
> >>> a=urllib.urlopen("http://boards.gamers.com/messages/overview.asp?name=panther_xl&page=2")
> >>> print a.read()
> 
> Then you do not get the page source but some HTML about the page being
> moved.
> 
> So is this a consequence of it being an ASP page, and I am out of luck,
> or is there a simple way to achieve what I want anyway?

It's just a matter of how the server side application works.

If you got back an HTTP 3xx redirection response, then your client just
has to read the Location header from it and open the new location.
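
A minimal sketch of doing that by hand, written for current Python 3
rather than 2.1. Today's urllib.request follows most 3xx responses on
its own, so the sketch switches that off purely to show the steps. The
names NoRedirect, redirect_target and fetch are made up for
illustration, and the list of status codes handled is my own choice.

```python
import urllib.error
import urllib.parse
import urllib.request

# Status codes treated as "go look somewhere else" (an assumption).
REDIRECT_CODES = (301, 302, 303, 307, 308)


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so we can inspect them."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Declining here lets the default error handler raise HTTPError.
        return None


def redirect_target(base_url, status, location):
    """Return the absolute URL a 3xx response points at, or None."""
    if status not in REDIRECT_CODES or not location:
        return None
    # Location may be relative, so resolve it against the requested URL.
    return urllib.parse.urljoin(base_url, location)


def fetch(url, max_hops=5):
    """Fetch url, following up to max_hops redirects by hand."""
    opener = urllib.request.build_opener(NoRedirect)
    for _ in range(max_hops):
        try:
            with opener.open(url) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            target = redirect_target(url, err.code, err.headers.get("Location"))
            if target is None:
                raise  # a real error, not a redirect
            url = target
    raise RuntimeError("too many redirects")
```

The hop limit is there so that two pages redirecting to each other
cannot keep the client looping forever.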

If you get back an HTML META HTTP-EQUIV tag with a redirection, that's
more or less the same thing, and you have to check for that too.
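
Checking for that could look roughly like the sketch below, again in
current Python 3. It assumes the usual 'content="0; URL=..."' form of
the tag; MetaRefreshFinder and meta_refresh_target are hypothetical
names, and real pages may well format the attribute differently.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class MetaRefreshFinder(HTMLParser):
    """Record the URL from the first <meta http-equiv="refresh"> tag."""

    def __init__(self):
        super().__init__()
        self.target = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.target is not None:
            return
        attrs = dict(attrs)  # attribute names arrive lowercased
        if (attrs.get("http-equiv") or "").lower() != "refresh":
            return
        # Assumed layout of content: "<delay>; URL=<target>"
        _delay, _, rest = (attrs.get("content") or "").partition(";")
        name, _, value = rest.partition("=")
        if name.strip().lower() == "url" and value.strip():
            self.target = value.strip().strip("'\"")


def meta_refresh_target(base_url, html):
    """Return the absolute META refresh target in html, or None."""
    finder = MetaRefreshFinder()
    finder.feed(html)
    if finder.target is None:
        return None
    return urljoin(base_url, finder.target)
```

If this returns a URL, the client opens it just as it would for a 3xx
response.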

It gets more annoying if the page sends javascript that tries to
navigate to another page.
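
Short of running the script, about the best a client can do is guess.
The sketch below is only a heuristic that I am assuming covers the
common cases: it looks for string literals assigned to location or
location.href, or passed to location.replace() or location.assign().
JS_NAV and js_navigation_targets are made-up names, and anything that
builds the URL at run time will slip past it.

```python
import re

# Matches e.g.  window.location = "x"   location.href = 'x'
#               location.replace("x")   location.assign('x')
JS_NAV = re.compile(
    r"""location(?:\.href)?\s*(?:=|\.replace\s*\(|\.assign\s*\()\s*(["'])(.+?)\1"""
)


def js_navigation_targets(html):
    """Return URL literals that the page's script appears to navigate to."""
    return [match.group(2) for match in JS_NAV.finditer(html)]
```

Any URL it returns may be relative, so it still needs resolving against
the address of the page it came from.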

Basically, what you're trying to do (programmatically interpret pages
that were intended for human interaction) is a pain in the neck. You
will find yourself having to manually examine the HTML in the target
pages, tweak your client for those specific pages, and continue
tweaking it as the page author changes the format over time.