Using Python 2.1 to download asp www pages
Zugz
zugz.public at DEL-ete-MEbtinternet.com
Sun Jan 6 15:17:33 EST 2002
Hi,
I've recently written some Python code to extract some details about posting
frequency etc. from a message board I use regularly.
I used IE5.5's Save As to give me some pages to work on offline.
I would now like to automate the whole process by downloading all the
relevant pages, or maybe even just accessing them directly.
If I use urlopen on a regular .htm page, in this case from the collection of
links I call my www site, then things work as you would expect. You get the
html source:
>>> import urllib
>>> a = urllib.urlopen("http://www.zugz.btinternet.co.uk/NonSFBooksBookshops.htm")
>>> print a.read()
as you would hope.
However, if I access one of the pages of interest, which all have the same
form as below but with a different page number at the end:
>>> a = urllib.urlopen("http://boards.gamers.com/messages/overview.asp?name=panther_xl&page=2")
>>> print a.read()
Then you do not get the page source but some HTML about the page being
moved.
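To see what is actually coming back, something like the following might help. It is only a sketch: describe_response is a name made up here, and it assumes the object returned by urlopen offers geturl() and info(), as urllib's file-like objects are documented to do.

```python
def describe_response(response, requested_url):
    """Summarise where a urlopen()-style response actually came from.

    'response' is assumed to provide geturl() and info(), like the
    file-like object returned by urllib's urlopen.
    """
    final_url = response.geturl()   # address the data really came from
    headers = response.info()       # header object with a get() method
    return {
        "requested": requested_url,
        "final": final_url,
        # True when the server sent us somewhere other than where we asked
        "redirected": final_url != requested_url,
        "content_type": headers.get("Content-Type"),
    }
```

With the session above it could be called as describe_response(a, url), where url is the string passed to urlopen. If "final" differs from "requested", the server redirected the request rather than serving the page itself.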
So is this a consequence of it being an ASP page, in which case I am out of
luck, or is there a simple way to achieve what I want anyway?
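For the automation itself, the addresses could be generated rather than typed by hand. This is a minimal sketch, assuming the only thing that changes between pages is the page number. overview_urls and the constant names are made up for illustration; the base address and board name are copied from the URL above.

```python
# Base address and board name, copied from the overview URL in this post.
BASE_URL = "http://boards.gamers.com/messages/overview.asp"
BOARD_NAME = "panther_xl"

def overview_urls(first_page, last_page, base=BASE_URL, name=BOARD_NAME):
    """Return the overview URLs for first_page..last_page inclusive."""
    urls = []
    for page in range(first_page, last_page + 1):
        # Only the page number varies from one address to the next.
        urls.append("%s?name=%s&page=%d" % (base, name, page))
    return urls
```

Each address in the returned list could then be passed to urlopen in a loop, once the redirect problem is sorted out.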
Thanks in advance for any help you may be able to give.
Regards,
Zugz.