urllib2/cookies - surely there's a better way?

Richard Shea richardshea at fastmail.fm
Fri Apr 30 02:48:36 EDT 2004


Hi - I'm writing a script which fetches a page from a web server and
takes note of any Set-Cookie headers in the response so that when I
next request a page I can send those cookies back to the server. This
is so that the usage analysis software on the server (which is based
on cookies) will take account of the script's activities.

Now the thing is I'm halfway through doing this, but I'm thinking there
must be a more refined mechanism than the one I'm using (see below).

I'm not really asking if there's a way to smarten up the rather clunky
splits (although that would be welcome too); I'm more asking whether
there's a more refined interface to the whole area of cookies.

Strangely enough, the doco says that f.info() will "return the
meta-information of the page, as a dictionary-like object" - well, as
far as I can see it's a string, and f.info().headers is a list. I
don't usually find errors in the doco, so this makes me wonder if
there's something I'm doing fundamentally wrong?
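
To make the "dictionary-like" wording concrete, here is a minimal offline sketch. It assumes the object returned by f.info() behaves roughly like a message object from the standard email package, which is an assumption on my part since urllib2 uses its own message class. The RAW_HEADERS text is made up for illustration and does not come from a real server.

```python
import email

# Made-up response headers for illustration only (not from a real server).
RAW_HEADERS = (
    "Content-Type: text/html\n"
    "Set-Cookie: session=abc123; Path=/\n"
    "Set-Cookie: visitor=42; Path=/a/b/\n"
    "\n"
)

# Parse the header block into a message object that supports
# dictionary-style lookups by header name.
msg = email.message_from_string(RAW_HEADERS)

# Lookups by name are case-insensitive.
content_type = msg["content-type"]

# A header may appear more than once; get_all() returns every value.
set_cookies = msg.get_all("Set-Cookie")

print(content_type)
print(set_cookies)
```

The point of the sketch is only that an object can look like one block of text when printed and still answer lookups by header name, which may be what the doco means.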

Anyway any ideas would be welcome. Here goes with the work in progress
...
 

import urllib2


req_headers = {
        'User-Agent': 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)',
        'Referer': ''
}

TARGETURL = 'http://www.somedomain.com/a/b/'

req = urllib2.Request(TARGETURL, None, req_headers)
f = urllib2.urlopen(req)
# f.info().headers is a list of raw "Name: value" header lines
lstHeaders = f.info().headers

for h in lstHeaders:
        # split the header line into its name and its value
        lstHeaderContents = h.split(":", 1)
        if lstHeaderContents[0].upper() == "SET-COOKIE":
                print lstHeaderContents[1]
                # get the keyword/value pair to the left of the first ';'
                lstYetAnother = lstHeaderContents[1].split(";", 1)
                # put the keyword and the value into a list
                lstOneMore = lstYetAnother[0].strip().split("=", 1)
                print "Keyword=" + lstOneMore[0] + ". Value = " + lstOneMore[1] + "."
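
For comparison, the manual splits above could probably be handed to the standard library's cookie parser instead. This is only a sketch under assumptions: the helper name cookie_header_from and the sample values are hypothetical, and the import is written so it runs on either the old Cookie module or its later home in http.cookies.

```python
try:
    from http.cookies import SimpleCookie  # Python 3 location
except ImportError:
    from Cookie import SimpleCookie        # Python 2 location


def cookie_header_from(set_cookie_values):
    """Build a Cookie request-header value from Set-Cookie response values.

    Hypothetical helper: keeps only each cookie's name and value and
    ignores attributes such as Path or Expires.
    """
    pairs = []
    for raw in set_cookie_values:
        jar = SimpleCookie()
        jar.load(raw)  # parses "name=value; attr=..." into morsels
        for name, morsel in jar.items():
            pairs.append("%s=%s" % (name, morsel.value))
    return "; ".join(pairs)


# Made-up Set-Cookie values for illustration only.
sample = ["session=abc123; Path=/", "visitor=42; Path=/a/b/"]
cookie_header = cookie_header_from(sample)
print(cookie_header)
```

The resulting string could then be sent with the next request, for example via req.add_header('Cookie', cookie_header). Note that this sketch does no domain, path or expiry checking, so it is not a full cookie implementation.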



That's the script - thanks for reading this far.

regards

richard.


