[issue3300] urllib.quote and unquote - Unicode issues

Guido van Rossum report at bugs.python.org
Wed Aug 13 19:17:20 CEST 2008


Guido van Rossum <guido at python.org> added the comment:

> Bill Janssen <bill.janssen at gmail.com> added the comment:
>
> Erik van der Poel at Google has now chimed in with stats on current URL
> usage:
>
> ``...the bottom line is that escaped non-utf-8 is still quite prevalent,
> enough (in my opinion) to require an implementation in Python, possibly
> even allowing for different encodings in the path and query parts (e.g.
> utf-8 path and gb2312 query).''
>
> http://lists.w3.org/Archives/Public/www-international/2008JulSep/0042.html
>
> I think it's worth remembering that a very large proportion of the use
> of Python's urllib.unquote() is in implementations of Web server
> frameworks of one sort or another.  We can't control what the browsers
> that talk to such frameworks produce; the IETF doesn't control that,
> either.  In this case, "practicality beats purity" is the clarion call
> of the browser designers, and we'd better be able to support them.

I think we're supporting these sufficiently by allowing developers to
override the encoding and errors values. I see no argument here against
having a default encoding of UTF-8.
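
A minimal sketch of what that override looks like in practice, assuming
the Python 3 urllib.parse signatures (quote() and unquote() with
encoding and errors keyword arguments). The sample string and the
example.com URL are made up for illustration and are not taken from
the tracker discussion.

```python
from urllib.parse import quote, unquote, urlsplit

text = "中文"  # illustrative sample, not from the original thread

# With no arguments, quoting and unquoting use UTF-8.
utf8_escaped = quote(text)
decoded_default = unquote(utf8_escaped)

# A caller that knows the peer uses gb2312 overrides the encoding.
gb_escaped = quote(text, encoding="gb2312")
decoded_gb = unquote(gb_escaped, encoding="gb2312")

# Decoding gb2312 escapes with the UTF-8 default does not raise:
# unquote() defaults to errors="replace", so bad bytes become U+FFFD.
mangled = unquote(gb_escaped)

# Passing errors="strict" turns the mismatch into an exception instead.
try:
    unquote(gb_escaped, errors="strict")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# The mixed case from the quoted message: UTF-8 path, gb2312 query.
# Each part is unquoted separately with its own encoding.
url = "http://example.com/" + utf8_escaped + "?q=" + gb_escaped
parts = urlsplit(url)
path = unquote(parts.path)                      # UTF-8 default
query = unquote(parts.query, encoding="gb2312")  # explicit override
```

The point of the sketch is that the UTF-8 default covers the common
case, while a framework that knows better can still pick a different
encoding per URL component.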

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue3300>
_______________________________________

