[Patches] [Patch #102364] Fix for 119822: Allow Unicode in urllib

noreply@sourceforge.net noreply@sourceforge.net
Sun, 3 Dec 2000 10:31:25 -0800


Patch #102364 has been updated. 

Project: python
Category: library
Status: Closed
Submitted by: loewis
Assigned to: loewis
Summary: Fix for 119822: Allow Unicode in urllib

Follow-Ups:

Date: 2000-Nov-12 03:09
By: loewis

Comment:
This fixes the bug by explicitly converting Unicode strings to ASCII. I could find no indication in the RFCs that anything but ASCII is allowed in URLs.

Most of the patch's size comes from renaming the local variable type to urltype, so that the builtin type() function remains accessible.
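
The conversion described above can be illustrated with a minimal sketch. This is not the code from the patch: the helper name to_ascii and its error handling are assumptions made for illustration, and it is written for current Python, where str is the Unicode type. It only shows the idea of accepting a Unicode URL when every character is ASCII and rejecting it otherwise.

```python
def to_ascii(url):
    """Return url as ASCII bytes (hypothetical helper, not the patch's code).

    Byte strings pass through unchanged. Unicode strings are encoded as
    ASCII; any non-ASCII character is reported as a ValueError.
    """
    if isinstance(url, bytes):
        return url
    try:
        return url.encode("ascii")
    except UnicodeEncodeError as exc:
        raise ValueError("URL %r contains non-ASCII characters" % (url,)) from exc


# Name the local 'urltype' rather than 'type', so that the builtin type()
# is not shadowed inside the function (the rename mentioned above).
def describe(url):
    urltype = url.split(":", 1)[0]
    return urltype, type(url).__name__
```

Under these assumptions, a URL such as "http://www.python.org/" converts cleanly, while one containing a character outside ASCII raises an error instead of being silently mangled.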
-------------------------------------------------------

Date: 2000-Nov-12 05:11
By: lemburg

Comment:
There are movements which want to add UTF-8 support to URLs.
I don't know if there already are RFCs on this, but since even
MS Explorer supports this, I guess the movement must be strong ;-)

It does seem to be the natural choice and DNS is moving in that
direction too.

-------------------------------------------------------

Date: 2000-Nov-13 00:41
By: loewis

Comment:
Looking at http://www.ietf.org/ids.by.wg/idn.html, there is hardly any consensus on how to do internationalized domain names.

As for Unicode in URLs, there is not even an Internet Draft proposing usage of UTF-8. So I'd propose to follow established standards until drafts for new standards become available.
-------------------------------------------------------

Date: 2000-Nov-13 11:55
By: gvanrossum

Comment:
Looks good.  Suggestion: maybe another name than 'toASCII()', since this may change to something else later? How about to8bit()?
-------------------------------------------------------

Date: 2000-Dec-03 10:31
By: loewis

Comment:
Committed as urllib.py 1.108.
-------------------------------------------------------

-------------------------------------------------------
For more info, visit:

http://sourceforge.net/patch/?func=detailpatch&patch_id=102364&group_id=5470