[Tutor] retrieving httponly cookies on accessing webpage with urllib2

xbmuncher xboxmuncher at gmail.com
Fri Oct 17 05:40:30 CEST 2008


I'm trying to mimic my Firefox browser when requesting a webpage with Python.
Here are the request headers captured with Wireshark when I accessed the page with Firefox:
GET /dirName/ HTTP/1.1
Host: www.website.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive


The website responds with these headers:
HTTP/1.1 200 OK
Date: Fri, 17 Oct 2008 03:16:19 GMT
Server: Apache/2.0.59 (FreeBSD) PHP/4.4.7 with Suhosin-Patch
X-Powered-By: PHP/4.4.7
Set-Cookie: bbsessionhash=1c9eacae7c56fefc79e627b07a9af8ae; path=/; HttpOnly
Set-Cookie: bblastvisit=1224613379; expires=Sat, 17 Oct 2009 03:16:19 GMT; path=/
Set-Cookie: bblastactivity=0; expires=Sat, 17 Oct 2009 03:16:19 GMT; path=/
Cache-Control: private
Pragma: private
X-UA-Compatible: IE=7
Content-Encoding: gzip
Content-Length: 7099
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: text/html; charset=ISO-8859-1
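
For reference, the three Set-Cookie headers above can be parsed with the standard library to see what HttpOnly amounts to: a flag on the cookie, meant to keep scripts running in the browser from reading it, rather than something that changes how the header is delivered. This is only a minimal sketch, assuming Python 3 (where the Cookie module is named http.cookies); the variable names are illustrative and not from the original script.

```python
from http.cookies import SimpleCookie

# Set-Cookie values copied from the Firefox capture above.
set_cookie_headers = [
    "bbsessionhash=1c9eacae7c56fefc79e627b07a9af8ae; path=/; HttpOnly",
    "bblastvisit=1224613379; expires=Sat, 17 Oct 2009 03:16:19 GMT; path=/",
    "bblastactivity=0; expires=Sat, 17 Oct 2009 03:16:19 GMT; path=/",
]

cookies = SimpleCookie()
for header in set_cookie_headers:
    cookies.load(header)  # each call adds the cookie found in one header

for name, morsel in cookies.items():
    # morsel["httponly"] is truthy only when the HttpOnly flag was present
    flag = "HttpOnly" if morsel["httponly"] else "no HttpOnly flag"
    print(name, morsel.value, flag)
```

Only bbsessionhash carries the flag; the other two are ordinary cookies with an expiry date.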



So I tried trusty ol' urllib2 to request it in Python:
import urllib2


url = 'http://www.website.com'

# request headers copied from the Firefox capture
h = {
    'User-Agent' : 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.3) '
                   'Gecko/2008092417 Firefox/3.0.3',
    'Accept' : 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Language' : 'en-us,en;q=0.5',
    'Accept-Encoding' : 'gzip,deflate',
    'Accept-Charset' : 'ISO-8859-1,utf-8;q=0.7,*;q=0.7',
    'Keep-Alive' : '300',
    'Connection' : 'keep-alive'
}
#request page
reqObj = urllib2.Request(url, None, h)
urlObj = urllib2.urlopen(reqObj)

#read response
print urlObj.read()
print urlObj.geturl()
print urlObj.info()

#close urlObj
urlObj.close()

raw_input('press a key...')
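
One side effect of sending Accept-Encoding: gzip,deflate by hand is that urllib2 does not decompress the body itself, so read() can return compressed bytes whenever the server answers with Content-Encoding: gzip, as it did for Firefox. A rough sketch of handling that, assuming Python 3; decode_body is a hypothetical helper name, not part of any library:

```python
import gzip
import zlib


def decode_body(raw, content_encoding):
    """Return the body decompressed according to a Content-Encoding value."""
    encoding = (content_encoding or "").strip().lower()
    if encoding == "gzip":
        return gzip.decompress(raw)
    if encoding == "deflate":
        try:
            return zlib.decompress(raw)  # zlib-wrapped deflate
        except zlib.error:
            # some servers send a raw deflate stream without the zlib header
            return zlib.decompress(raw, -zlib.MAX_WBITS)
    return raw  # no (or unknown) encoding: leave the bytes untouched


# Usage with a response object (hypothetical variable name `resp`):
#   body = decode_body(resp.read(), resp.headers.get("Content-Encoding"))
```

Dropping the Accept-Encoding header from the request is the simpler alternative if compression is not needed.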


It returns these headers:
Date: Fri, 17 Oct 2008 03:39:20 GMT
Server: Apache/2.0.59 (FreeBSD) PHP/4.4.7 with Suhosin-Patch
X-Powered-By: PHP/4.4.7
Content-Length: 1311
Connection: close
Content-Type: text/html


Notice that the content length is considerably smaller, and none of the
cookies Firefox received are sent to me. I know only a little about HttpOnly
cookies; I gather they are some kind of special cookie, and I suppose that
has something to do with Python not being able to access them the way
Firefox does. All I want is for Python to receive the same cookies that
Firefox did. How can I do this? I read somewhere that HttpOnly cookies were
implemented in the Python cookie module:
http://glyphobet.net/blog/blurb/285
...yet the other cookies aren't being sent either.
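
A minimal sketch of one way to have Python keep whatever cookies the server sends, assuming Python 3, where urllib2 and cookielib are named urllib.request and http.cookiejar. The local DemoHandler server is purely a stand-in (an assumption, so the example runs without the real site): it sends Set-Cookie headers modeled on the Firefox capture, including the HttpOnly one. The /dirName/ path mirrors the path Firefox requested.

```python
import threading
import urllib.request
from http.cookiejar import CookieJar
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Stand-in server for the demo; not the real website."""

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header(
            "Set-Cookie",
            "bbsessionhash=1c9eacae7c56fefc79e627b07a9af8ae; path=/; HttpOnly",
        )
        self.send_header("Set-Cookie", "bblastactivity=0; path=/")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: any free port
threading.Thread(target=server.serve_forever, daemon=True).start()

# The cookie jar stores every cookie the opener receives via Set-Cookie.
jar = CookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
opener.addheaders = [
    ("User-Agent",
     "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.3) "
     "Gecko/2008092417 Firefox/3.0.3"),
]

url = "http://127.0.0.1:%d/dirName/" % server.server_port
with opener.open(url) as resp:
    resp.read()

received = {cookie.name: cookie.value for cookie in jar}
for cookie in jar:
    print(cookie.name, cookie.value)

server.shutdown()
server.server_close()
```

In Python 2 the same pieces are cookielib.CookieJar and urllib2.build_opener(urllib2.HTTPCookieProcessor(jar)). The HttpOnly cookie lands in the jar like any other, so the flag alone should not hide it from Python. It may also be worth noting that the script above requests http://www.website.com while Firefox requested /dirName/ on that host, which could by itself account for the different response.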