cookielib incorrectly escapes cookie

John J. Lee jjlee at reportlab.com
Sun Jul 9 12:39:26 EDT 2006


"BJörn Lindqvist" <bjourne at gmail.com> writes:

> I have some very serious trouble getting cookies to work. After a lot
> of work (urllib2 is severely underdocumented, arcane and overengineered
> btw) I'm finally able to accept cookies from a server. But I'm still

And a good day to you too ;-)

In passing, there's a new HOWTO document on urllib2 here, which you
may find helpful:

http://svn.python.org/view/python/trunk/Doc/howto/urllib2.rst?rev=46062&view=markup


Doesn't seem to be part of the build process yet, so not available yet
in nicely-formatted HTML form on the python.org website -- I guess
it's included in HTML format in 2.5 beta1, though.

Note that that document is substantially rewritten over the version
that was originally on Michael's web site, from which the HOWTO
originally came.  I haven't checked whether the version on Michael's
website has been updated recently, so use the version linked to above
instead.


> unable to return them to a server. Specifically the script I'm trying
> to write logs on to a server, gets a session cookie and then tries to
> access a secure page using the same session cookie. But the cookie
> header cookielib produces is very different from the header it
> received.

Well (sigh), I didn't make all that up, you know ;-) Believe it or
not, that's what's supposed to happen if you send Version=1 cookies
(though few browsers ever supported it).  In case it's your own
server, I should note that I don't know of any reason for an internet
server ever to send Version=1 cookies, given what the majority of
browsers actually do.  However, since the cookie protocols (plural)
are, in practice, ill-defined (which is no individual's fault,
really), cases that work in popular browsers should usually be fixed.

Please test to make sure your problem goes away with Python 2.5 beta1:
I believe this bug is already fixed.  Please do try it though: it's
unlikely that anybody else has tested the fix.  I think beta2 is due
on Wednesday 12th, so it's advisable to get in quick if you want this
to work in 2.5 (please Cc: me personally to let me know whether it
works for you).

Note that it should work for you in Python 2.5 if and only if (not
rfc2965 or rfc2109_as_netscape) is true, where rfc2109_as_netscape and
rfc2965 are constructor arguments of DefaultCookiePolicy.  To
understand why (on some level, anyway), read the in-development docs
for DefaultCookiePolicy here:

http://docs.python.org/dev/lib/module-cookielib.html
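
To see what those two constructor arguments change in practice, here is
a minimal sketch.  It assumes Python 3, where cookielib was renamed
http.cookiejar and urllib2 became urllib.request.  FakeResponse and
cookie_header_for are hypothetical helpers, and the Set-Cookie value is
only a guess at what the server sent (the report does not show it).  No
network access is needed.

```python
import email.message
import urllib.request
from http.cookiejar import CookieJar, DefaultCookiePolicy


class FakeResponse:
    """Hypothetical stand-in for an HTTP response; CookieJar only calls info()."""

    def __init__(self, set_cookie):
        self._headers = email.message.Message()
        self._headers["Set-Cookie"] = set_cookie

    def info(self):
        return self._headers


def cookie_header_for(policy):
    """Store one Version=1 cookie under `policy`; return the Cookie header sent back."""
    url = "http://www.example.com/"
    jar = CookieJar(policy)
    # Assumed server header: a quoted value plus Version=1.
    response = FakeResponse('SessionId="66b908e5025d93ed"; Version=1; Path=/')
    jar.extract_cookies(response, urllib.request.Request(url))
    followup = urllib.request.Request(url)
    jar.add_cookie_header(followup)
    return followup.get_header("Cookie")


# Defaults (rfc2965=False, rfc2109_as_netscape=None).
default_header = cookie_header_for(DefaultCookiePolicy())
# RFC 2965 handling switched on.
rfc2965_header = cookie_header_for(DefaultCookiePolicy(rfc2965=True))
# RFC 2965 on, but Version=1 Set-Cookie headers treated as Netscape cookies.
downgraded_header = cookie_header_for(
    DefaultCookiePolicy(rfc2965=True, rfc2109_as_netscape=True))

print(default_header)
print(rfc2965_header)
print(downgraded_header)
```

With the first and third policies the value should come back exactly as
it was sent; with the second, the header should carry $Version=1 and the
backslash-escaped quotes described in the report.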


Thanks for the report.

If you'd like a better workaround than the one you have for older
Pythons, I'll be happy to post one if you'll test this with 2.5 (no
good deed goes unpunished ;-)


[...]
> # Here is where it doesn't work unless the hack is applied. The cookie
> # header that is sent without the hack looks like this:
> #
> #   Cookie: $Version=1; SessionId=\"66b908e5025d93ed\"; $Path="/"
> #
> # It is not accepted by the server, probably because the SessionID
> # string is wrong.

I think there is a bug here: the quoting is indeed incorrect, but
probably not for the reason you expect.  (On a separate point, the
funny-looking $Version and $Path are at least strictly correct, and
my copy of the "lynx" browser, for example, does send them.)  I
won't try to explain the details here.

Since the fix would likely be complicated and risky, and of benefit
only in very unusual circumstances, I don't intend to fix it at this
stage of the Python release process.  It will not affect you when
using Python 2.5, as long as (not rfc2965 or rfc2109_as_netscape) is
true (see above for the definition of those names).  That's true by
default in 2.5, so all you should need is:

import urllib2
# With no argument, HTTPCookieProcessor creates its own CookieJar.
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor())
opener.open("http://www.example.com/")


(Unless you want to get at the CookieJar, e.g. to load and save
cookies, in which case go ahead and override the default CookieJar by
passing one to the HTTPCookieProcessor as you do in the code you
posted.)
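
A rough sketch of that load-and-save case, assuming Python 3 module
names (http.cookiejar and urllib.request in place of cookielib and
urllib2).  build_opener_with_jar and the cookie file location are
hypothetical, and LWPCookieJar is just one of the file-backed jars the
module provides.

```python
import os
import tempfile
import urllib.request
from http.cookiejar import LWPCookieJar


def build_opener_with_jar(cookie_file):
    """Return (opener, jar); the jar is pre-loaded if cookie_file exists."""
    jar = LWPCookieJar(cookie_file)
    if os.path.exists(cookie_file):
        # ignore_discard=True also restores session cookies.
        jar.load(ignore_discard=True)
    processor = urllib.request.HTTPCookieProcessor(jar)
    return urllib.request.build_opener(processor), jar


# Hypothetical location for the cookie file.
cookie_file = os.path.join(tempfile.mkdtemp(), "cookies.lwp")
opener, jar = build_opener_with_jar(cookie_file)
# ... opener.open(...) calls would go here ...
jar.save(ignore_discard=True)  # session cookies are skipped without this flag
```

Session cookies such as the SessionId above are marked as "discard", so
they are only written out and read back when ignore_discard=True is
passed.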

I also note that you're adding an HTTPCookieProcessor, *and* also
calling .add_cookie_header().  HTTPCookieProcessor's job is to call
.add_cookie_header() / .extract_cookies() for you (even on redirects,
where you never get the opportunity to do it "manually").  You never
need to call those functions yourself if using urllib2.
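
For illustration only, the sketch below calls the processor's two hooks
by hand, which is roughly what the opener does around each request.  It
assumes Python 3 names (urllib.request, http.cookiejar); FakeResponse,
the URLs and the cookie value are made up for the example.

```python
import email.message
import urllib.request
from http.cookiejar import CookieJar


class FakeResponse:
    """Hypothetical stand-in for an HTTP response; only info() is used."""

    def __init__(self, set_cookie):
        self._headers = email.message.Message()
        self._headers["Set-Cookie"] = set_cookie

    def info(self):
        return self._headers


jar = CookieJar()
processor = urllib.request.HTTPCookieProcessor(jar)

# Response side: the processor calls jar.extract_cookies() for you.
login = urllib.request.Request("http://www.example.com/login")
processor.http_response(login, FakeResponse("SessionId=66b908e5025d93ed; Path=/"))

# Request side: the processor calls jar.add_cookie_header() for you.
secure = urllib.request.Request("http://www.example.com/secure")
processor.http_request(secure)
cookie_header = secure.get_header("Cookie")
print(cookie_header)
```

In real code neither hook is called directly; opener.open() runs them
for every request and response, including the ones generated by
redirects.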

HTH!


John


