[Tutor] Re: Web page capture

Schmidt, Allen J. aschmidt@nvcc.edu
Wed, 21 Aug 2002 09:43:05 -0400


Thanks to all who helped. I did some more digging and found parts which I
used to create this script. It works and dumps my data into a MySQL
database.


import urllib,MySQLdb,string,re,time

curtime=time.localtime(time.time())
fmt='%Y%m%d'
ourtimestamp=time.strftime(fmt,curtime)
class myURLOpener(urllib.FancyURLopener):
    def setpasswd(self, user, passwd):
        self.__user = user
        self.__passwd = passwd
    def prompt_user_passwd(self, host, realm):
        return self.__user, self.__passwd

# here's where we hook up to the database and get a cursor
dbc = MySQLdb.connect(host="localhost", db="xxxx", passwd="xxxxx")
crsr = dbc.cursor()

urlopener = myURLOpener()
urlopener.setpasswd("username","mypassword")
fp = urlopener.open("http://localhost/foldername/pagename.htm")
report = fp.read()

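# Let the driver quote the values; a statement built by string
# concatenation breaks as soon as the page contains a single quote.
sql = "insert into dailyreports (report_date, report_content) values (%s, %s)"
x = crsr.execute(sql, (ourtimestamp, report))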
 
crsr.execute('select * from dailyreports')
queryresults=crsr.fetchall()
for entries in queryresults:
  report_id = entries[0]
  report_date = entries[1]
  report_content = entries[2]
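
For anyone reusing the insert step: a minimal sketch of the same store-and-read-back flow with placeholders doing the quoting. It assumes Python 3 and uses the standard sqlite3 module with an in-memory table as a stand-in for MySQL, so the placeholder is `?` rather than the `%s` that MySQLdb expects. The function name store_report and the sample page are made up for illustration; the table and column names follow the script above.

```python
import sqlite3
import time


def store_report(conn, report_date, report_content):
    """Insert one report using placeholders instead of string concatenation."""
    conn.execute(
        "insert into dailyreports (report_date, report_content) values (?, ?)",
        (report_date, report_content),
    )
    conn.commit()


# In-memory stand-in for the MySQL table (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table dailyreports ("
    "report_id integer primary key, report_date text, report_content text)"
)

# A page containing both quote characters, which would break a
# concatenated SQL string.
page = "<p class=\"note\">It's today's report</p>"
stamp = time.strftime("%Y%m%d", time.localtime())
store_report(conn, stamp, page)

rows = conn.execute(
    "select report_id, report_date, report_content from dailyreports"
).fetchall()
for report_id, report_date, report_content in rows:
    print(report_id, report_date, len(report_content))
```

The page comes back byte for byte as it went in, quotes included, which is the point of handing the values to the driver separately.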

 	

-----Original Message-----
From: tutor-admin@python.org [mailto:tutor-admin@python.org]On Behalf Of
Emile van Sebille
Sent: Wednesday, August 21, 2002 9:09 AM
To: tutor@python.org
Subject: [Tutor] Re: Web page capture


Schmidt, Allen J.
> Is it possible to use URLLIB (or something else) to mimic the process of
> going to a URL, entering a userid and password and saving the resulting page
> to a file or as a field in a database? I have looked a bit at URLLIB but
> have not seen any way to enter the authentication stuff. Also how to
> actually save the html code itself to a file that can be called from a DB
> and rendered later.
>

One way is to include the userid and passwd in the url:

tempfile, msg = urllib.urlretrieve(
    "http://userid:passwd@somewhere.com:8080/required/page")

which then allows:

data = open(tempfile).read()
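
A rough sketch of the same idea for Python 3, where, as far as I know, urllib.request does not pick the userid and passwd out of the URL for you. It splits the credentials off and sends them as a Basic Authorization header instead. The helper name request_from_credential_url is made up for illustration, and the sketch assumes the server uses HTTP Basic authentication and a plain (non-IPv6) host name.

```python
import base64
from urllib.parse import unquote, urlsplit, urlunsplit
from urllib.request import Request


def request_from_credential_url(url):
    """Turn http://user:pass@host/path into a Request with a Basic auth header."""
    parts = urlsplit(url)
    # Rebuild the host part without the "user:pass@" prefix.
    host = parts.hostname or ""
    if parts.port is not None:
        host = "%s:%d" % (host, parts.port)
    clean_url = urlunsplit(
        (parts.scheme, host, parts.path, parts.query, parts.fragment)
    )
    req = Request(clean_url)
    if parts.username is not None:
        user = unquote(parts.username)
        passwd = unquote(parts.password or "")
        token = base64.b64encode(("%s:%s" % (user, passwd)).encode("utf-8"))
        req.add_header("Authorization", "Basic " + token.decode("ascii"))
    return req


req = request_from_credential_url(
    "http://userid:passwd@somewhere.com:8080/required/page"
)
print(req.full_url)
print(req.get_header("Authorization"))
```

Nothing is fetched here; passing req to urllib.request.urlopen and calling read() on the result would retrieve the page.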

HTH,

--

Emile van Sebille
emile@fenx.com


_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor