Retrieve HTML from site using cookies

onceuponapriori at gmail.com
Thu Sep 21 13:58:00 EDT 2006


Greetings gents. I'm a Railser working on a Django app that needs to do
some scraping to gather its data.

I need to programmatically access a site that requires a username and
password. Once I post to the login.php page, there seems to be a
redirect, and the site seems to use a session (perhaps via a cookie) to
determine whether the user is logged in. So I need to log in and then
have cookies and/or sessions maintained as I access the page that
contains the content I am actually interested in.

What is the simplest way to post data to a form, accept a cookie to
maintain the session (and support redirects) and then (now logged into
the site) retrieve the content of a page on the site?
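
To make the question concrete, here is a rough sketch of the flow I have
in mind, using only the standard library (http.cookiejar and
urllib.request from current Python). The function names, the form field
names and the URLs are placeholders I made up, not the real site's, and I
am assuming the login form is a plain urlencoded POST. I don't know
whether this is the simplest way to do it.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_session():
    """Return (opener, jar); the opener remembers cookies between requests."""
    jar = http.cookiejar.CookieJar()
    # HTTPCookieProcessor stores cookies from Set-Cookie headers and sends
    # them back on later requests. build_opener also installs the default
    # redirect handler, so a 302 after login is followed automatically.
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar


def login_and_fetch(login_url, credentials, page_url):
    """Log in with a form POST, then fetch page_url in the same session.

    credentials is a dict of form fields; the field names are whatever
    the site's login form expects (hypothetical in the usage below).
    """
    opener, _jar = make_session()
    body = urllib.parse.urlencode(credentials).encode("utf-8")
    # Supplying data turns the request into a POST.
    with opener.open(login_url, data=body) as response:
        response.read()
    # The session cookie set during login is sent along with this request.
    with opener.open(page_url) as response:
        return response.read().decode("utf-8", errors="replace")
```

Usage would be something like
login_and_fetch("http://example.com/login.php",
{"username": "me", "password": "secret"}, "http://example.com/data.php"),
where example.com and the field names stand in for the real ones.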

Is there a library or technique that makes this simple?

Thanks!




More information about the Python-list mailing list