[ python-Bugs-1313119 ] urlparse "caches" parses regardless of encoding

Sat Jan 13 20:25:10 CET 2007

Bugs item #1313119, was opened at 2005-10-04 19:57
Message generated for change (Comment added) made by lemburg
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1313119&group_id=5470

Category: Unicode
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Ken Kinder (kkinder)
>Assigned to: Nobody/Anonymous (nobody)
Summary: urlparse "caches" parses regardless of encoding

Initial Comment:
The issue can be summarized with this code:

>>> import urlparse
>>> urlparse.urlparse(u'http://www.python.org/doc')
(u'http', u'www.python.org', u'/doc', '', '', '')
>>> urlparse.urlparse('http://www.python.org/doc')
(u'http', u'www.python.org', u'/doc', '', '', '')

Once urlparse has "cached" the parse of a URL, it returns
that cached result regardless of the type of the argument.
Notice that in the second call to urlparse, I passed it a
str and got unicode objects back.

This can be quite confusing when, as a developer, you
think you have already encoded all your objects, you call
urlparse, and suddenly you have unicode objects again
where you expected str.

----------------------------------------------------------------------

>Comment By: M.-A. Lemburg (lemburg)
Date: 2007-01-13 20:25

Message:
Unassigning: I don't use urlparse, so can't comment.

----------------------------------------------------------------------