newbie can't get html grab script to run..pls help!!

Charlie Derr drivel_drool at bigfoot.com
Thu May 4 11:57:54 EDT 2000


#! /path/to/your/python

import urllib

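# Base URL; only the id at the end changes from page to page.
s = "http://www.agu.org.au/club_directory/club_details.sps?id="

# range(1,41) yields the ids 1 through 40.
for i in range(1,41) :
	url = s + `i`
	# Save each page as club1.html, club2.html, ...
	urllib.urlretrieve(url, "club" + `i` + ".html")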




That worked for me. Note that those are backquotes (shorthand for repr(i)), not single quotes.
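
On a Python 3 interpreter the script above will not run as written: backquotes are no longer valid syntax, and urlretrieve lives in urllib.request. Below is a minimal sketch of the same loop under those assumptions. The helper names club_url, club_filename and fetch_clubs are made up for illustration, and there is no guarantee the site still serves these pages.

```python
from urllib.request import urlretrieve

# Same base URL as in the script above; only the id changes.
BASE = "http://www.agu.org.au/club_directory/club_details.sps?id="


def club_url(i):
    # str(i) takes the place of the old backquote shorthand.
    return BASE + str(i)


def club_filename(i):
    # club1.html, club2.html, ...
    return "club" + str(i) + ".html"


def fetch_clubs(first=1, last=40):
    # Download ids first..last inclusive, one file per id,
    # into the current directory.
    for i in range(first, last + 1):
        urlretrieve(club_url(i), club_filename(i))
```

Calling fetch_clubs() would request ids 1 through 40; nothing is downloaded until that function is called.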


~ -----Original Message-----
~ From: python-list-admin at python.org
~ [mailto:python-list-admin at python.org]On Behalf Of Jason
~ Sent: Thursday, May 04, 2000 11:08 AM
~ To: python-list at python.org
~ Subject: newbie can't get html grab script to run..pls help!!
~
~
~ hi all
~ at the end of this post is a letter i sent to the BeOS newsgroup to get
~ an answer to my question...a very kind bloke sent back a script and told
~ me i should use python.....
~ ###start script
~
~ import httplib
~
~ db = httplib.HTTP("www.agu.org.au")
~ id_list = range(40)
~ for id in id_list:
~ 	db.putrequest("GET", "/club_details.sps?id=%d" % id)
~ 	errcode, errmsg, headers = db.getreply()
~ 	f = db.getfile()
~ 	data = f.read()
~ 	outfile = open("%d.html" % id, "w")
~ 	outfile.write(data)
~ 	outfile.close()
~
~ ### script done
~
~ ....as i was on my windows laptop i downloaded the windows version and
~ installed it...now this is where i get in trouble!
~ i cannot for the life of me figure out how to run this script..can
~ someone pls help!
~ i went to heaps of online sites to try to find out stuff but no
~ luck...one of the suggestions was that i had to do this at a dos prompt
~ PATH=%PATH%;"C:\Program Files\Python"
~ but all i get is a Too Many Parameters error!
~ i know php/c++/html/sql but this is driving me crazy!
~ all i want to do is to grab a few pages off a website and save them to
~ file...
~ http://www.agu.org.au/club_directory/club_details.sps?id=1
~ this is one of a series of the pages i want to grab....only the id number
~ changes..i want the first 40 ids.
~ i have also tried rebol (www.rebol.com) which was great but didn't want
~ to work for this example...?!
~
~ many thanks
~ jason
~ > i remember seeing a script on benews that queried a site for it's news
~ > headlines and saved them to a .html file for use on a website.....but...
~ > what i need to do, is to query a website database and grab out particular
~ > entries (about 40)
~ > the site is done in asp and the database is queried by an id number when
~ > you click on the link....so rather than clicking on 40 different links and
~ > saving each as a web page, i would like to have a script go to the web page
~ > and then just move thru the id numbers that would be stated in the script
~ > and just save those pages as .html files
~ > the only thing that changes on each web page is the id number at the end
~ > and i think they have about 2000 listings but i only need a few...
~ > i hope i haven't rambled on too much.....i probably could have done it
~ > manually by the end of this email...anyway..good to learn more scripting!!
~ > can someone pls help me
~ >
~ > many thanks
~ > Jason Savidge
~ > IT Manager
~ > One Stop Entertainment
~ > Brisbane Australia
~




