urllib and persistence

Mark Carter cartermark46 at ukmail.com
Sun Mar 23 05:05:15 EST 2003


I have been using Python to download and process files from the
internet, using urllib to great effect:
import urllib

f = urllib.urlopen(url)   # url: address of the file to fetch
txt = f.read()
# process txt

I now find that I want to download files from a site that requires a
user ID and password, which I have. The problem, though, is that when
I use urllib to download the URL I'm interested in, the page returned
to me is the login page. If I were using Internet Explorer, I could
log in and then view the URL I was interested in. So it is apparent
that the connection between Internet Explorer and the server has some
kind of "persistence" to it.

What I need to do, therefore, is understand what is going on, in order
to get around the login problem.
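
One possible explanation for the "persistence", offered only as an
assumption since nothing above says how the site tracks logins, is a
session cookie: the server sets a cookie when the login form is
submitted, and the browser sends it back with every later request. A
minimal sketch of doing the same from Python follows. It is written
for Python 3, where urllib.urlopen lives at urllib.request.urlopen.
The names make_session_opener and log_in, the login URL, and the form
field names are all hypothetical and would have to be taken from the
real site's login page.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_session_opener():
    """Return (opener, jar); the opener stores and resends cookies."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def log_in(opener, login_url, form_fields):
    """POST the login form so any session cookie lands in the jar.

    login_url and the keys of form_fields are hypothetical; copy them
    from the real site's login form.
    """
    data = urllib.parse.urlencode(form_fields).encode("ascii")
    # Passing data makes this a POST request instead of a GET.
    with opener.open(login_url, data) as response:
        return response.read()
```

After a successful log_in call, opener.open(url).read() would take the
place of urllib.urlopen(url).read(), and the request would carry
whatever cookies the login returned. If the site uses HTTP
authentication or a script-driven login rather than a plain form and
cookie, this sketch would not apply as written.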

Actually, I'm not insistent that the downloads are done in Python.
What I really need is a way of logging in and then performing a bulk
download from a list of URLs that I have (which could be stored in a
text file, for example).
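
The bulk-download half of this could be sketched roughly as below, in
Python 3. The file layout (one URL per line, with blank lines and
lines starting with # skipped) and the names read_url_list,
local_name, and download_all are assumptions made for illustration,
not anything the site or urllib prescribes.

```python
import os
import urllib.parse
import urllib.request


def read_url_list(path):
    """Return the non-blank, non-comment lines of a text file as URLs."""
    with open(path, encoding="utf-8") as handle:
        lines = (line.strip() for line in handle)
        return [line for line in lines if line and not line.startswith("#")]


def local_name(url, index):
    """Pick a file name from the URL path, falling back to a numbered one."""
    name = os.path.basename(urllib.parse.urlsplit(url).path)
    return name or "download_%d" % index


def download_all(urls, dest_dir, open_url=urllib.request.urlopen):
    """Fetch each URL with open_url and save the body under dest_dir."""
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    for index, url in enumerate(urls):
        target = os.path.join(dest_dir, local_name(url, index))
        with open_url(url) as response, open(target, "wb") as out:
            out.write(response.read())
        saved.append(target)
    return saved
```

The open_url parameter defaults to plain urlopen, but the open method
of any urllib.request opener can be passed instead, for example one
that has already logged in and holds the site's cookies. Two URLs
that end in the same file name would overwrite each other in this
sketch.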




More information about the Python-list mailing list