Auto Logon to site and get page

John Smith jsmith5616 at yahoo.com
Sat Jan 31 18:10:45 EST 2009


I'm trying to automatically log into a site and store the resulting HTML using Python. The site uses a login form, and its JavaScript appears to hash the password with an MD5-based HMAC (hex_hmac_md5) before the form is submitted.

These are the important parts of the form:

<script language="JavaScript" src="/admin/javascript/md5.js"></script>
<script language="JavaScript"><!--
var pskey = "770F11B12EBB7D15058170FA3AD12E685D3A46112B841B1E7BE375F600E62705";
//-->
</script>

<form name="LoginForm" action="/guardian/home.html" method="POST" target="_top" onsubmit="doStudentLogin(this);">

<input type="hidden" name="pstoken" value="3895">
<input type="text" name="account" value="" size="35">
<input type="password" name="pw" value="" size="35">
<input type="image" src="/images/btn_enter.gif" width="89" height="27" border="0" alt="Enter">

</form>

This is the function called in md5.js:
function doStudentLogin(form)
{
var pw = form.pw.value;
var pw2 = pw; // Save a copy of the password preserving case
pw = pw.toLowerCase();
form.pw.value = hex_hmac_md5(pskey, pw);
if (form.ldappassword!=null) {
// LDAP is enabled, so send the clear-text password
// Customers should have SSL enabled if they are using LDAP
form.ldappassword.value = pw2; // Send the unmangled password
}
return true;
}
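
For comparison, here is a minimal Python 3 sketch of what doStudentLogin appears to compute. It assumes md5.js is the common implementation in which hex_hmac_md5 treats both the key and the data as plain character strings (so pskey is used as text, not decoded from hex) and that the password is ASCII. The name student_login_hash is made up for this illustration.

```python
import hashlib
import hmac


def hex_hmac_md5(key: str, data: str) -> str:
    """Return the hex HMAC-MD5 of data under key, both taken as plain strings."""
    return hmac.new(key.encode("utf-8"), data.encode("utf-8"), hashlib.md5).hexdigest()


def student_login_hash(pskey: str, password: str) -> str:
    # Mirrors doStudentLogin: lowercase the password first, then HMAC it with pskey.
    return hex_hmac_md5(pskey, password.lower())
```

Because of the toLowerCase() call, a password with capital letters hashes to the same value as its all-lowercase form.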
 
I am not sure what ldappassword is or does. Can someone explain that?
 
Here's my code:
 
from urllib import urlopen, urlencode
import re
import hmac
 
account = 'account'
psw = 'my password'
url = "http://ps.pvcsd.org/guardian/home.html"
 
homepagetxt = urlopen("http://ps.pvcsd.org").read()
 
# get key and pstoken from login page
m = re.search('<input type="hidden" name="pstoken" value="(?P<id>[0-9]+)"', homepagetxt)
token = m.group('id')
 
m = re.search('var pskey = "(?P<id>[a-zA-Z0-9]+)"', homepagetxt)
key = m.group('id')
 
hobj = hmac.new(key, psw.lower())   # md5.js lowercases the password before hashing
psw = hobj.hexdigest()              # hex HMAC-MD5 of the password (a hash, not encryption)
 
data = { 'pstoken' : token, 'account' : account, 'pw' : psw }
e = urlencode(data)
 
page = urlopen(url, e)
txt = page.read()
 
f = open("text.html", 'w')
f.write(txt)
f.close()
 
This doesn't work, however. It just sends me back to the main login page and doesn't say "invalid password" or anything. I've checked, and yes, the Python hmac function produces the same result (the hashed password) as the md5.js file. Does anyone know what I am doing wrong?
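
One thing that may be worth ruling out is session cookies. This is only a guess, and nothing in the page source above confirms it: if the server ties pstoken and pskey to a cookie set when the login page is loaded, two separate urlopen calls will not carry that cookie from the first request to the second. Below is a rough Python 3 sketch that keeps cookies across both requests. The helper names build_login_fields and fetch_home_page are hypothetical, and the regular expressions assume the page markup looks exactly like the excerpt above.

```python
import hashlib
import hmac
import re
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar

# Patterns assume the markup shown in the form excerpt above.
TOKEN_RE = re.compile(r'name="pstoken" value="([0-9]+)"')
KEY_RE = re.compile(r'var pskey = "([A-Za-z0-9]+)"')


def build_login_fields(login_html, account, password):
    """Extract pstoken and pskey from the login page and build the POST fields."""
    token = TOKEN_RE.search(login_html).group(1)
    key = KEY_RE.search(login_html).group(1)
    # Same steps as doStudentLogin: lowercase, then hex HMAC-MD5 with pskey.
    hashed = hmac.new(key.encode("ascii"),
                      password.lower().encode("utf-8"),
                      hashlib.md5).hexdigest()
    return {"pstoken": token, "account": account, "pw": hashed}


def fetch_home_page(base_url, account, password):
    """Load the login page and post the form through one cookie-aware opener."""
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(CookieJar()))
    login_html = opener.open(base_url).read().decode("utf-8", "replace")
    fields = build_login_fields(login_html, account, password)
    body = urllib.parse.urlencode(fields).encode("ascii")
    with opener.open(base_url + "/guardian/home.html", body) as response:
        return response.read().decode("utf-8", "replace")
```

Calling fetch_home_page("http://ps.pvcsd.org", account, psw) would then replace the two urlopen calls. If the result is still the login page, cookies are probably not the cause.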


      