[ python-Feature Requests-841728 ] urllib and cookie module improvements
SourceForge.net
noreply at sourceforge.net
Tue Dec 9 17:59:26 EST 2003
Feature Requests item #841728, was opened at 2003-11-13 20:56
Message generated for change (Comment added) made by jjlee
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=355470&aid=841728&group_id=5470
Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: paul rubin (phr)
Assigned to: Nobody/Anonymous (nobody)
Summary: urllib and cookie module improvements
Initial Comment:
1. The Cookie module should do a better job parsing real-
world cookies (the stuff that comes from http servers
following Set-cookie: headers) and should also have a
documented way to emit a client-side cookie (i.e.
generate a correct Cookie: header from a cookie
object).
2. Urllib or urllib2 should be enhanced to read incoming
cookie headers and send back the appropriate cookies in
the event of an HTTP redirect. Many sites set a cookie
then redirect to some other location which tries to read
the cookie; if the cookie isn't there, the new location
bounces back to the original one to set the cookie, so
you get a redirection loop.
3. The scheme of having urllib.urlopen() return the http
headers in a dictionary-like object doesn't quite work:
for example, there can be several Set-cookie headers in
a single http response. I don't know if the opener
currently combines them or discards some; neither way
is really satisfactory. There really should be a list for
each header type, but that would mess up the existing
published interface, so maybe a new 'urllib3' is needed.
I'm just starting to explore this stuff but it seems to me
like a serious urllib module needs to do quite a bit more
than the existing ones do. The Perl LWP documentation
might be a good place to look for inspiration.
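
To make point 1 concrete, here is a minimal sketch of turning received Set-Cookie values into a client-side Cookie: header. It assumes Python 3, where the Cookie module is named http.cookies. The helper name client_cookie_header and the sample cookie values are invented for illustration, and the sketch ignores domain, path and expiry matching entirely.

```python
from http.cookies import SimpleCookie


def client_cookie_header(set_cookie_values):
    """Build a Cookie: header value from Set-Cookie values (hypothetical helper)."""
    jar = SimpleCookie()
    for value in set_cookie_values:
        # load() parses "name=value; attr=..." and stores one Morsel per cookie name.
        jar.load(value)
    # A client sends back only name=value pairs, never the attributes.
    return "; ".join(f"{m.key}={m.coded_value}" for m in jar.values())


header = client_cookie_header(["session=abc123; Path=/", "theme=dark; Path=/"])
print("Cookie:", header)
```

A real client would also have to decide which cookies apply to a given URL, which is the part the Cookie module does not attempt.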
----------------------------------------------------------------------
Comment By: John J Lee (jjlee)
Date: 2003-12-09 22:59
Message:
Logged In: YES
user_id=261020
Hmm, on 3., it's true that there is no documented way of
getting at multiple headers (and in fact, at the moment the
object returned by urlopen(url).info() is a subclass of
mimetools.Message, which is deprecated,
so .getallmatchingheaders() might well disappear soon).
CVS rev 1.57 of httplib attempted to fix this (bug 432621), but
the solution (making headers available joined with commas) is
not sufficient, thanks to the nonstandard behaviour of
Set-Cookie headers (Netscape cookie values may contain
unquoted commas, in violation of RFC 2616).
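
A small sketch of why comma-joining loses information, using only string operations. The header values are made up; the expires attribute follows the Netscape date format, which contains a comma after the weekday.

```python
# Two separate Set-Cookie headers as a server might send them (sample values).
set_cookie_headers = [
    "session=abc123; expires=Tue, 09-Dec-2003 23:00:00 GMT; path=/",
    "theme=dark; path=/",
]

# What a comma-joining header API would hand back to the caller.
joined = ", ".join(set_cookie_headers)

# Splitting on commas cannot recover the original two headers,
# because the expires date itself contains a comma.
naive_parts = [part.strip() for part in joined.split(",")]
print(len(naive_parts), naive_parts)
```

The split yields three fragments instead of two headers, so the caller cannot tell where one cookie ends and the next begins.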
I suppose in future, HTTP response objects will be
implemented using email.Message objects (since mimetools is
deprecated), so it seems reasonable to add and document
a .get_all(hdr_name) method to httplib.HTTPMessage (perhaps
by going ahead and reimplementing it using email.Message).
I'll put it on my list to write a patch.
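
For illustration, this is roughly what a get_all-style interface looks like with the email package's Message class. The sketch assumes Python 3 and parses a hand-written header block rather than a real HTTP response; the header values are invented.

```python
from email.parser import Parser

# A hand-written header block standing in for a real HTTP response.
raw_headers = (
    "Content-Type: text/html\n"
    "Set-Cookie: session=abc123; path=/\n"
    "Set-Cookie: theme=dark; path=/\n"
    "\n"
)

message = Parser().parsestr(raw_headers, headersonly=True)

# Dictionary-style access returns only the first matching header...
first_only = message["Set-Cookie"]
# ...while get_all() returns every occurrence as a separate list item.
all_cookies = message.get_all("Set-Cookie")
print(first_only)
print(all_cookies)
```

Because each header stays a separate list item, commas inside a cookie value cause no ambiguity.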
----------------------------------------------------------------------
Comment By: John J Lee (jjlee)
Date: 2003-12-03 19:08
Message:
Logged In: YES
user_id=261020
1. and 2. are dealt with in other tracker items, and 3. is incorrect,
so this should be closed.
It's better if you submit different issues separately if possible.
1. See http://wwwsearch.sf.net/ClientCookie. It doesn't use the
Cookie module, since the code in the two modules is almost
disjoint, and it would just obfuscate ClientCookie, really. (Oh,
it's Paul Rubin... I see from your recent c.l.py message you've
just noticed this module :-)
2. As for 1.
I'm working on getting ClientCookie into a state suitable for the
standard library. See also patches 852995, which makes it
possible to implement cookie handling in a urllib2 handler, and
548197, which is somebody else's old cookie-handling patch.
3. You can already get the separate headers too:
response.info().getallmatchingheaders("Set-Cookie").
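
A rough sketch of the behaviour requested in point 2: remember a cookie from one response and send it back on the follow-up request. It assumes Python 3, where this kind of cookie handling lives in http.cookiejar. FakeResponse is a hypothetical stand-in so the example runs without a network, and the example.com URLs and cookie value are placeholders.

```python
import email
from http.cookiejar import CookieJar
from urllib.request import Request


class FakeResponse:
    """Hypothetical stand-in for an HTTP response; CookieJar only calls info()."""

    def __init__(self, raw_headers):
        self._headers = email.message_from_string(raw_headers)

    def info(self):
        return self._headers


jar = CookieJar()

# First request: the server sets a cookie and (in real life) redirects.
first = Request("http://example.com/login")
response = FakeResponse("Set-Cookie: session=abc123; Path=/\n\n")
jar.extract_cookies(response, first)

# Follow-up request to the redirect target: the jar adds the Cookie header.
redirected = Request("http://example.com/home")
jar.add_cookie_header(redirected)
cookie_header = redirected.get_header("Cookie")
print(cookie_header)
```

In normal use these two steps are not called by hand: an opener built with urllib.request.HTTPCookieProcessor performs them around each request, including redirects.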
----------------------------------------------------------------------