How to share session with IE

Cameron Walsh cameron.walsh at gmail.com
Tue Oct 10 21:18:03 EDT 2006


John J. Lee wrote:
> "Bernard" <bernard.chhun at gmail.com> writes:
>> zdp wrote:
> [...]
>>> However, now I need to process some pages by a python program. When I
>>> use urllib.urlopen(theurl), I can only get a page which tells me I need
>>> to log in. I think that's reasonable, because I wasn't in a logged-in
>>> session as IE was.
>>>
>>> So how can I do my job? I want to get the right webpage by the url. I
>>> have searched for answers in the groups but didn't get a clear answer.
>>> Should I use win32com or urllib? Any reply or information is appreciated.
>>> Hope I made it clear.
> 
>> You can do the same thing as IE on your forum using urllib2 and
>> cookielib. In short you need to code a small webcrawler. I can give you
>> my browser module if necessary.
>> You might not have the time to fiddle with the coding part or my
>> browser module so you can also use this particularly useful module :
>> http://wwwsearch.sourceforge.net/mechanize/
>> The documentation is pretty clear for an initiated python programmer.
>> If that's not your case, I'd recommend reading some ebooks on the python
>> language first to get used to it.
> 
> In particular, if you're following the approach Bernard suggests, you
> can either:
> 
> 1. Log in every time your program runs, by going through the sequence
>    of clicks, pages, etc. that you would use in a browser to log in.
> 
> 2. Once only (or once a month, or whatever), log in by hand using IE
>    with a "Remember me"-style feature (if the website offers that) --
>    where the webapp asks the browser to save the cookie rather than
>    just keeping it in memory until you close your browser.  Then your
>    program can load the cookies from your real browser's cookie store
>    using this:
> 
> http://wwwsearch.sourceforge.net/mechanize/doc.html#browsers
> 
> 
> There are other alternatives too, but they depend on knowing a little
> bit more about how cookies and web apps work, and may or may not work
> depending on what exactly the server does.  I'm thinking specifically
> here of saving *session* cookies (the kind that usually go away when
> you close your browser) in a file -- but the server may not like them
> when you send them back the next time, depending how much time has
> elapsed since the last run.  Of course, you can always detect the
> "need to login" condition, and react accordingly.
> 
> 
> John
> 


Another option, instead of making your program run through a series of 
clicks and text inputs (which is difficult to program), is to browse the 
HTML source until you find the name of the script that processes the 
login, and use Python to request that script with the necessary form 
fields encoded in the request.  Request something like
http://www.targetsite.com/login.cgi?username=pyuser&password=fhqwhgads
This format is not guaranteed to work, since the login script or server 
might accept only one of GET and POST, and a URL with a query string like 
this is a GET.  If the script only accepts POST, the fields have to go in 
the request body instead; creating that request is slightly more involved, 
and to be honest I haven't looked into how to do it.
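
For illustration, here is a minimal sketch of how the same form fields could
be sent either way. It uses the Python 3 standard library (urllib.request and
http.cookiejar, the renamed urllib2 and cookielib). The URL and the field
names "username" and "password" are only the made-up ones from the example
above; the real ones have to be read from the login form's HTML. The helper
names are hypothetical too.

```python
from http.cookiejar import CookieJar
from urllib.parse import urlencode
from urllib.request import HTTPCookieProcessor, Request, build_opener

# Hypothetical values: take the real URL and field names from the login form.
LOGIN_URL = "http://www.targetsite.com/login.cgi"
FIELDS = {"username": "pyuser", "password": "fhqwhgads"}


def build_login_request(url, fields, method="POST"):
    """Return a Request that submits the form fields by GET or by POST."""
    encoded = urlencode(fields)
    if method.upper() == "GET":
        # GET: the fields travel in the query string of the URL.
        return Request(url + "?" + encoded)
    # POST: the fields travel in the request body. Passing data= is what
    # makes urllib issue a POST instead of a GET.
    return Request(url, data=encoded.encode("ascii"))


def make_opener():
    """Return an opener that keeps cookies the server sets and sends them back."""
    jar = CookieJar()
    return build_opener(HTTPCookieProcessor(jar)), jar


if __name__ == "__main__":
    opener, jar = make_opener()
    request = build_login_request(LOGIN_URL, FIELDS, method="POST")
    # Nothing is sent here; this only shows what would go over the wire.
    print(request.get_method(), request.full_url, request.data)
    # opener.open(request) would submit the login, and later opener.open(...)
    # calls on the same opener would reuse whatever cookies the server set.
```

Whether the server wants GET or POST is usually visible in the method
attribute of the form tag in the page source.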

Thereafter, you will have to send the cookie back with every page request 
so the server can recognise your session.  Which brings me to question 
whether or not it is possible to log in manually once, export the cookie 
to a file, and reload this file each time the program is run.  Or to 
generate the cookie yourself.  Quite frankly any server application that 
allows the client to control whether or not they have logged in sucks, 
but I've seen a fair few that do.[citation required]
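
A rough sketch of the save-and-reload idea, assuming Python 3's
http.cookiejar (cookielib in Python 2). The file name and the two helper
functions are hypothetical. Whether the server still accepts a reloaded
session cookie depends entirely on the server, as John notes above.

```python
import os
from http.cookiejar import LWPCookieJar
from urllib.request import HTTPCookieProcessor, build_opener

COOKIE_FILE = "session_cookies.txt"  # hypothetical file name


def load_opener(cookie_file=COOKIE_FILE):
    """Build an opener whose cookie jar is pre-filled from cookie_file, if it exists."""
    jar = LWPCookieJar(cookie_file)
    if os.path.exists(cookie_file):
        # ignore_discard=True also loads session cookies, which would
        # otherwise be skipped because they are marked to be discarded.
        jar.load(ignore_discard=True)
    return build_opener(HTTPCookieProcessor(jar)), jar


def save_cookies(jar):
    """Write the jar back to its file, session cookies included."""
    jar.save(ignore_discard=True)
```

The program would call load_opener() at startup, fetch pages through the
returned opener, and call save_cookies(jar) before exiting. If a fetched
page turns out to be the login page, the saved cookie has gone stale and
the program needs to log in again.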

Cameron.



More information about the Python-list mailing list