[issue1755841] Patch for [ 735515 ] urllib2 should cache 301 redir

John J Lee report at bugs.python.org
Wed Jan 6 00:29:12 CET 2010


John J Lee <jjlee at users.sourceforge.net> added the comment:

To check that I understood Antoine: by "per-request", I assume you mean the same kind of thing as the current use of .redirect_dict, i.e. all of the urllib2.Request instances that may result from a single request passed by the user to .open()/urlopen() share the same cache state.

In addition to what Antoine said:

 0. patch reports that your latest patch is malformed (see below).
 1. I'm afraid I think any 301 caching that's not per-request should be off by default.  Defaulting to on would be a significant change in behaviour, because urllib2.urlopen (and OpenerDirector.open) currently retains no state between calls (unless you add a handler that keeps state, such as HTTPCookieProcessor, but no such handlers are added by default).
 2. I imagine the code changes should be entirely (or almost entirely) confined to RedirectHandler and/or AbstractHTTPHandler.  Is there any justification for changing OpenerDirector?  Certainly no need to add any globals!
 3. http_error_30x is a documented interface, but it's not frequently used.  The argument(s) used to control caching should be somewhere else (see questions below).
 4. Please do post the doc changes for review once the implementation is decided on.
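
To make points 1 to 3 concrete, here is a rough sketch of the shape I have in mind. It is not the patch under review. The names CachingRedirectHandler, cache_permanent, permanent_redirects and cached_target are made up for illustration, and the sketch deliberately leaves open how a cached target would be applied to later requests. The import shim is only there so that the example also runs where the module is called urllib.request.

```python
try:
    import urllib2 as urllib_request          # Python 2 name
except ImportError:
    import urllib.request as urllib_request   # Python 3 name


class CachingRedirectHandler(urllib_request.HTTPRedirectHandler):
    """Hypothetical handler that can remember 301 targets.

    All state lives on the handler instance: no globals and no change
    to OpenerDirector.  Caching is off unless explicitly requested.
    """

    def __init__(self, cache_permanent=False):
        # The switch is a constructor argument rather than an extra
        # argument to http_error_30x (assumption: one possible place).
        self.cache_permanent = cache_permanent
        self.permanent_redirects = {}

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Let the stock handler decide whether the redirect is allowed.
        new = urllib_request.HTTPRedirectHandler.redirect_request(
            self, req, fp, code, msg, headers, newurl)
        # Only permanent redirects are recorded, and only when enabled.
        if self.cache_permanent and code == 301 and new is not None:
            self.permanent_redirects[req.get_full_url()] = newurl
        return new

    def cached_target(self, url):
        """Return the remembered 301 target for url, or None."""
        return self.permanent_redirects.get(url)
```

A user who wants the behaviour would opt in with something like urllib_request.build_opener(CachingRedirectHandler(cache_permanent=True)); plain urlopen() would keep retaining no state between calls.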


Some questions to consider:

 a. How should POST requests be handled when there is a cached permanent redirect URI?
 b. How useful is per-request caching, in the sense defined above (as opposed to per-user agent caching -- i.e. per-handler in our case)?  Best answered with data from the web.
 c. Should URIs be normalised before being used as a cache key?
 d. Might the cache get big?

For each of (a) to (d): what do existing implementations (e.g. Firefox) do?
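
To show what questions a, c and d look like in code, here is a minimal sketch under stated assumptions; it does not answer them. normalise_key and BoundedRedirectCache are hypothetical names, the normalisation rules (lower-case host, empty path becomes "/", fragment dropped) are one plausible choice among several, the limit of 100 entries is arbitrary, and ignoring the cache for anything other than GET/HEAD is just the most conservative option for question a.

```python
from collections import OrderedDict

try:
    from urlparse import urlsplit, urlunsplit       # Python 2
except ImportError:
    from urllib.parse import urlsplit, urlunsplit   # Python 3


def normalise_key(url):
    """Hypothetical cache-key normalisation (question c)."""
    parts = urlsplit(url)
    # Lower-case only the host[:port] part; userinfo is case-sensitive.
    userinfo, at, hostport = parts.netloc.rpartition("@")
    netloc = userinfo + at + hostport.lower()
    path = parts.path or "/"
    # Fragments are never sent to the server, so drop them from the key.
    return urlunsplit((parts.scheme.lower(), netloc, path, parts.query, ""))


class BoundedRedirectCache(object):
    """Hypothetical size-limited store of 301 targets (question d)."""

    def __init__(self, max_entries=100):
        self.max_entries = max_entries
        self._entries = OrderedDict()

    def __len__(self):
        return len(self._entries)

    def remember(self, old_url, new_url):
        key = normalise_key(old_url)
        # Re-inserting moves the key to the "newest" end.
        self._entries.pop(key, None)
        self._entries[key] = new_url
        # Evict the oldest entries once the limit is exceeded.
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)

    def lookup(self, method, url):
        # Question a: this sketch never rewrites POST (or other) requests.
        if method not in ("GET", "HEAD"):
            return None
        return self._entries.get(normalise_key(url))
```

Eviction here is by insertion order only; whether something smarter is needed depends on the answer to question d.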


$ patch -p0 < urllib2-301-redirection-proper.diff
patching file Lib/urllib2.py
Hunk #3 succeeded at 549 with fuzz 2 (offset 18 lines).
Hunk #4 succeeded at 562 (offset 18 lines).
patch: **** malformed patch at line 55: @@ -604,8 +618,12 @@

$ patch --version
patch 2.5.9
Copyright (C) 1988 Larry Wall
Copyright (C) 2003 Free Software Foundation, Inc.

This program comes with NO WARRANTY, to the extent permitted by law.
You may redistribute copies of this program
under the terms of the GNU General Public License.
For more information about these matters, see the file named COPYING.

written by Larry Wall and Paul Eggert

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue1755841>
_______________________________________

