[issue2464] urllib2 can't handle http://www.wikispaces.com

Senthil report at bugs.python.org
Sun Aug 17 04:42:19 CEST 2008


Senthil <orsenthil at gmail.com> added the comment:

Ah, that was a simple fix. :) I completely overlooked the problem, even
after being given so many hints on the web-sig.

I have some comments on the patch, Facundo.
1) I don't think it is a good idea to include that portion in the
http_error_302 method. That makes the fix "very" specific to "this"
issue only.
Another point is that fixing broken URLs should not be urllib2's job;
urlparse would be a better place.
So, I came up with an approach wherein urllib2 does unparse(parse) of
the url and the parse methods fix the url if it is broken. (See
attached issue2464-PATCH1.diff)
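
To illustrate why a plain unparse(parse) round trip does not repair
anything by itself, here is a small sketch. It assumes the broken URL is
one with a query but no path (an assumption; the exact malformation is not
spelled out in this message), it uses Python 3's urllib.parse rather than
the Python 2 urlparse module, and www.example.com is a placeholder host.

```python
from urllib.parse import urlparse, urlunparse

# Placeholder URL (assumed shape of the breakage): host and query, no path.
broken = "http://www.example.com?token=abc"

parts = urlparse(broken)
roundtrip = urlunparse(parts)

# The parser reports an empty path and the round trip reproduces the input
# unchanged, so any repair would have to be added to the parse step itself.
print(repr(parts.path))
print(roundtrip == broken)
```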

But if we handle it in the urlparse methods, then we are very
susceptible to breaking RFC conformance and breaking a lot of tests, which
is not a good idea.

So, I introduced a fix_broken() method in urlparse and called it to solve
the issue, using the same logic as yours (issue2464-py26-FINAL.diff).
With a fix_broken() method in urlparse, we will have better control
whenever we want to implement a behavior which is RFC non-conforming but
widely implemented by browsers and clients.
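
A minimal sketch of what such a helper might look like, assuming the
breakage is a URL that has a host but an empty path. The body below is
only an illustration written against Python 3's urllib.parse; it is not
the contents of the attached diff, and www.example.com is a placeholder.

```python
from urllib.parse import urlsplit, urlunsplit

def fix_broken(url):
    """Hypothetical sketch: give a URL with a host but no path the path "/"."""
    parts = urlsplit(url)
    # Assumed repair rule: only touch URLs that name a host yet have no path.
    if parts.netloc and not parts.path:
        parts = parts._replace(path="/")
    return urlunsplit(parts)

# A path-less URL gains "/", a well-formed one is returned unchanged.
print(fix_broken("http://www.example.com?token=abc"))
print(fix_broken("http://www.example.com/a?b=1"))
```

Keeping the repair in a separate function leaves the regular parse
functions untouched, so callers such as a redirect handler can opt in to
the lenient behavior explicitly.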

All tests pass with issue2464-py26-FINAL.diff

Comments, please?

Added file: http://bugs.python.org/file11132/issue2464-py26-FINAL.diff

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue2464>
_______________________________________

