Web authentication urllib2

Gabriel dunmer at dreams.sk
Sat Jan 24 06:59:22 EST 2009


First, thank you both.

I think this isn't basic auth, because the page has a login form.
I read the site's HTML source and used Wireshark to analyze the communication
between my browser and the website, and I did find out that I was ignoring
one field.

I added it to the parameters, but it didn't help.
Maybe I'm still missing something.
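
One way to check for a missed field is to list every <input> the login form contains, hidden ones included, and compare that list with the parameters being posted. The following is only a sketch: FormFieldCollector, collect_form_fields and SAMPLE_HTML are made-up names, and the sample page is invented for illustration (the real form may look different, and the 'token' field is purely hypothetical). The try/except import is there so the same code runs on Python 2 and Python 3.

```python
try:
    from html.parser import HTMLParser   # Python 3
except ImportError:
    from HTMLParser import HTMLParser    # Python 2


class FormFieldCollector(HTMLParser):
    """Collect the name and default value of every named <input> tag."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != 'input':
            return
        attrs = dict(attrs)
        name = attrs.get('name')
        if name:
            # Inputs without a value attribute default to an empty string.
            self.fields[name] = attrs.get('value') or ''


def collect_form_fields(html):
    """Return a dict of field name -> default value found in the HTML."""
    parser = FormFieldCollector()
    parser.feed(html)
    parser.close()
    return parser.fields


# Invented login page, NOT the real site's markup.
SAMPLE_HTML = """
<form action="/login" method="post">
  <input type="text" name="login">
  <input type="password" name="pwd">
  <input type="hidden" name="page" value="">
  <input type="hidden" name="token" value="abc123">
  <input type="submit" value="Log in">
</form>
"""

fields = collect_form_fields(SAMPLE_HTML)
print(sorted(fields))
```

Any name that shows up here but not in the urlencode() call is a candidate for the missing parameter.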

Here's the post packet:
http://student.fiit.stuba.sk/~sevecek06/auth.txt

and here's the code again, with a small change and the real web location added:

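	import urllib
	import urllib2

	# Keep cookies between requests so a login session can persist.
	opener = urllib2.build_opener(urllib2.HTTPCookieProcessor())
	urllib2.install_opener(opener)

	# Supplying data makes urllib2 send a POST request.
	params = urllib.urlencode(dict(login='login', pwd='pass', page=''))
	f = opener.open('https://www.orangeportal.sk/', params)
	data = f.read()
	f.close()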

The login and password are fake, of course.
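
Two of the suggestions quoted below, sending a browser-like User-Agent header and looking at the HTTP headers that are actually exchanged, could be tried roughly like this. This is a sketch under assumptions, not a known fix: it is not established that the site checks the header at all, build_login_request and build_debug_opener are hypothetical helper names, and the User-Agent string is only a placeholder. In Python 3 the urllib2 functions live in urllib.request and urllib.parse, so the imports are written to work on both versions.

```python
try:  # Python 3
    from urllib.parse import urlencode
    from urllib.request import (HTTPCookieProcessor, HTTPHandler,
                                HTTPSHandler, Request, build_opener)
except ImportError:  # Python 2
    from urllib import urlencode
    from urllib2 import (HTTPCookieProcessor, HTTPHandler,
                         HTTPSHandler, Request, build_opener)


def build_login_request(url, fields, user_agent):
    """Build a POST request carrying the form fields and a User-Agent."""
    data = urlencode(fields).encode('ascii')
    return Request(url, data, {'User-Agent': user_agent})


def build_debug_opener():
    """Opener that keeps cookies and prints the HTTP exchange to stdout."""
    return build_opener(HTTPCookieProcessor(),
                        HTTPHandler(debuglevel=1),
                        HTTPSHandler(debuglevel=1))


request = build_login_request(
    'https://www.orangeportal.sk/',
    dict(login='login', pwd='pass', page=''),
    'Mozilla/5.0 (X11; Linux x86_64)')  # placeholder browser string
opener = build_debug_opener()

# Uncomment to actually send the request (needs network access):
# response = opener.open(request)
# data = response.read()
# response.close()
```

With debuglevel=1 the request line, the headers sent and the reply headers are printed, which can then be compared with the Wireshark capture of the browser's login.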

Thank you in advance for any help.


Steve Holden wrote:
> Gabriel wrote:
>> Hello,
>>
>> I'm new to Python and I would like to write a script which needs to log in
>> to a website. I'm experimenting with urllib2,
>> especially with something like this:
>>
>>     opener = urllib2.build_opener(urllib2.HTTPCookieProcessor())
>>     urllib2.install_opener(opener)
>>
>>     params = urllib.urlencode(dict(username='user', password='pass'))
>>     f = opener.open('https://web.com', params)
>>     data = f.read()
>>     f.close()
>>
>> The problem is that this code logs me in on some sites but not on
>> others, especially not on the one I really
>> need to log in to, and I don't know why. Is there some way to debug
>> this code and find out why the script cannot
>> log in on that specific site?
>>
>> Sorry if this question is too lame, but I am really a beginner in both
>> Python and web programming .)
>>
> That's actually pretty good code for a newcomer! There are a couple of
> issues you may be running into.
> 
> First, not all sites use "application-based" authentication - they may
> use HTTP authentication of some kind instead. In that case you have to
> pass the username and password as a part of the HTTP headers. Michael
> Foord has done a fair write-up of the issues at
> 
>   http://www.voidspace.org.uk/python/articles/authentication.shtml
> 
> and you will do well to read that if, indeed, you need to do basic
> authentication.
> 
> Second, if it *is* the web application that's doing the authentication
> in the sites that are failing (in other words if the credentials are
> passed in a web form) then your code may need adjusting to use other
> field names, or to include other data as required by the login form. You
> can usually find out what's required by reading the HTML source of the
> page that contains the login form.
> 
> Thirdly [nobody expects the Spanish Inquisition ...], it may be that
> some sites are extraordinarily sensitive to programmed login attempts
> (possibly due to spam), typically using a check of the "User-Agent:" HTTP
> header to "make sure" that the login attempt is coming from a browser
> and not a program. For sites like these you may need to emulate a
> browser response more fully.
> 
> You can use a program like Wireshark to analyze the network traffic,
> though you can get add-ons for Firefox that will show you the HTTP
> headers on request and response.
> 
> regards
>  Steve



