Logging into a website using urllib2

dmbkiwi at gmail.com
Fri Jan 27 14:30:50 EST 2006


I've been using urllib2 to try to automate logging into the Google
AdSense page.  I want to download the CSV report files so that I can
do some analysis of them.  However, I don't really know how web forms
work, and the examples on the python.org/doc site aren't much help.

I've found working scripted login code written in JavaScript, but I
don't speak JavaScript.  These are the relevant functions:

function preLogin(){
        cookies = "";
        with (adSenseUrl){
                location =
"https://www.google.com/accounts/ServiceLogin?service=adsense&hl=en-US&ltmpl=login&ifr=true&passive=true&rm=hide&nui=3&alwf=true&continue=https%3A%2F%2Fwww.google.com%2Fadsense%2Fgaiaauth&followup=https%3A%2F%2Fwww.google.com%2Fadsense%2Fgaiaauth"
        }
        adSenseUrl.autoRedirect=false;
        adSenseUrl.fetchAsync(doLogin);

}

function doLogin(){
        var authkey = adSenseUrl.result.match(/GA3T" value="([^"]*)"/)[1];
        oldadSenseUrl = adSenseUrl;
        adSenseUrl = new URL();
        adSenseUrl.setRequestHeader("Referer",oldadSenseUrl.location);
        preserveCookies(oldadSenseUrl,adSenseUrl);
        with (adSenseUrl){
                location =
"https://www.google.com/accounts/ServiceLoginAuth"
                postData =
"ltmpl=login&continue=https%3A%2F%2Fwww.google.com%2Fadsense%2Fgaiaauth&followup=https%3A%2F%2Fwww.google.com%2Fadsense%2Fgaiaauth&service=adsense&nui=3&ifr=true&rm=hide&ltmpl=login&hl=en-US&alwf=true&GA3T="+authkey+"&Email="+preferences.email.value+"&Passwd="+preferences.password.value+"&null=Sign+in"
        }
        adSenseUrl.autoRedirect = false;
        adSenseUrl.fetchAsync(processLogin);
}

function processLogin(){
        if (!adSenseUrl.result.match(/url = "([^"]*)"/)){
                statusText.data= statusTextShadow.data = "Unable to login.";
                var fade8 = new FadeAnimation( spinner, 0, 500,
animator.kEaseInOut );
                animator.runUntilDone( new Array( fade8) );
                log("quitting because of failed login.");
                loading = false;
                return false;
        }

        var authkey = adSenseUrl.result.match(/url = "([^"]*)"/)[1].replace('\\u003d',"=");
        oldadSenseUrl = adSenseUrl;
        adSenseUrl = new URL();
        adSenseUrl.setRequestHeader("Referer",oldadSenseUrl.location);
        preserveCookies(oldadSenseUrl,adSenseUrl);
        with(adSenseUrl){
                postData=""
                location=authkey
        }
        adSenseUrl.autoRedirect = false;
        adSenseUrl.fetchAsync(processConfirm);
}
function processConfirm(){
        oldadSenseUrl = adSenseUrl;
        adSenseUrl = new URL();
        adSenseUrl.setRequestHeader("Referer",oldadSenseUrl.location);
        preserveCookies(oldadSenseUrl,adSenseUrl);
        adSenseUrl.autoRedirect = true;
        adSenseUrl.location = "https://www.google.com" +
oldadSenseUrl.getResponseHeaders("Location");
        adSenseUrl.fetchAsync(processConfirm2);
}

function processConfirm2(){
        statusText.data= statusTextShadow.data = "loading data ...";

        if (adSenseUrl.response != 200){
                statusText.data= statusTextShadow.data = "Unable to login";
                var fade8 = new FadeAnimation( spinner, 0, 500,
animator.kEaseInOut );
                animator.runUntilDone( new Array( fade8) );
                log("quitting because of bad status code.");
                loading = false;
                return false;
        }
        oldadSenseUrl = adSenseUrl;
        adSenseUrl = new URL();
        adSenseUrl.setRequestHeader("Referer",oldadSenseUrl.location);
        preserveCookies(oldadSenseUrl,adSenseUrl);
        with (adSenseUrl){
                postData = ""
                location =
"https://www.google.com/adsense/report/aggregate?product=afc&dateRange.dateRangeType=simple&dateRange.simpleDate=thismonth&reportType=property&groupByPref=date&csv=true&unitPref=page"
        }
        adSenseUrl.autoRedirect = true;
        adSenseUrl.fetchAsync(processAdsenseData);
}
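
As far as I can tell, doLogin() and processLogin() each pull one value
out of the page they just fetched using a regular expression: first the
hidden GA3T form value, then a quoted "url = ..." string in which
\u003d stands for "=".  A rough Python sketch of just those two steps
is below.  The helper names and the sample page snippets are my own
inventions for illustration; I don't know what the real pages look
like beyond what the regexes imply.

```python
import re

# Patterns copied from the JavaScript doLogin() and processLogin().
# \s+ is used between GA3T" and value= because the archived listing
# was line-wrapped at that point, so the exact whitespace is a guess.
TOKEN_RE = re.compile(r'GA3T"\s+value="([^"]*)"')
REDIRECT_RE = re.compile(r'url = "([^"]*)"')


def extract_token(html):
    """Return the GA3T form value from the login page, or None."""
    match = TOKEN_RE.search(html)
    return match.group(1) if match else None


def extract_redirect(html):
    """Return the follow-up URL from the post-login page, or None."""
    match = REDIRECT_RE.search(html)
    if match is None:
        return None
    # The JavaScript replaces only the first \u003d; replacing every
    # occurrence here is an assumption about what is actually wanted.
    return match.group(1).replace('\\u003d', '=')


# Invented sample snippets, not real Google output.
sample_login_page = '<input type="hidden" name="GA3T" value="tiioBadqoko">'
sample_redirect_page = 'var url = "https://www.example.com/next?auth\\u003dabc";'

print(extract_token(sample_login_page))
print(extract_redirect(sample_redirect_page))
```

If either helper returns None, that presumably corresponds to the
"Unable to login." branch in processLogin().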

I've started with:

>>> import urllib, urllib2, cookielib
>>> cj = cookielib.LWPCookieJar()
>>> opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
>>> urllib2.install_opener(opener)
>>> req = urllib2.Request('https://www.google.com/accounts/ServiceLogin?service=adsense&hl=en-US&ltmpl=login&ifr=true&passive=true&rm=hide&nui=3&alwf=true&continue=https%3A%2F%2Fwww.google.com%2Fadsense%2Fgaiaauth&followup=https%3A%2F%2Fwww.google.com%2Fadsense%2Fgaiaauth')
>>> handle = urllib2.urlopen(req)
>>> print cj
<_LWPCookieJar.LWPCookieJar[<Cookie GA3T=tiioBadqoko for www.google.com/accounts>, <Cookie GoogleAccountsLocale_session=en for www.google.com/accounts>]>

But I'm a bit stuck as to where to go from there.  Can anyone help out?
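
My best guess at the next step, going by the postData string in
doLogin(), is to POST the form fields to ServiceLoginAuth through a
cookie-aware opener.  The sketch below only builds the request and
does not send it.  build_login_request is a name I made up, the
example credentials are placeholders, and I'm assuming the token
should come from the hidden GA3T form field rather than the cookie of
the same name.  I've also dropped the duplicated ltmpl field.  None of
this has been confirmed against the real site.

```python
try:
    # Python 2 module names, as used in this post
    import urllib2
    import cookielib
    from urllib import urlencode
except ImportError:
    # Python 3 renamed these modules
    import urllib.request as urllib2
    import http.cookiejar as cookielib
    from urllib.parse import urlencode

LOGIN_AUTH_URL = 'https://www.google.com/accounts/ServiceLoginAuth'
GAIA_URL = 'https://www.google.com/adsense/gaiaauth'


def build_login_request(token, email, password, referer):
    """Build (but do not send) the login POST described in doLogin()."""
    fields = [
        ('ltmpl', 'login'),
        ('continue', GAIA_URL),
        ('followup', GAIA_URL),
        ('service', 'adsense'),
        ('nui', '3'),
        ('ifr', 'true'),
        ('rm', 'hide'),
        ('hl', 'en-US'),
        ('alwf', 'true'),
        ('GA3T', token),
        ('Email', email),
        ('Passwd', password),
        ('null', 'Sign in'),
    ]
    # urlencode percent-escapes the values, which the JavaScript
    # string concatenation does not do for the email and password.
    data = urlencode(fields).encode('ascii')
    request = urllib2.Request(LOGIN_AUTH_URL, data)
    request.add_header('Referer', referer)
    return request


# One cookie jar shared across requests, so cookies set by the first
# page are sent back with the login POST.
cj = cookielib.LWPCookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))

req = build_login_request('tiioBadqoko', 'user@example.com', 'secret',
                          'https://www.google.com/accounts/ServiceLogin')
# handle = opener.open(req)   # the actual network call, not run here
```

Passing a data argument to Request is what turns it into a POST; with
no data it would be a GET like my first request.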

Thanks

Matt



