Bug in Python 2.6 urlencode

Terry Reedy tjreedy at udel.edu
Tue Sep 7 20:43:04 EDT 2010


On 9/7/2010 3:02 PM, John Nagle wrote:
>   There's a bug in Python 2.6's "urllib.urlencode".  If you pass
> in a Unicode character outside the ASCII range, instead of it
> being encoded properly, an exception is raised.
>
> File "C:\python26\lib\urllib.py", line 1267, in urlencode
> v = quote_plus(str(v))
> UnicodeEncodeError: 'ascii' codec can't encode character u'\xa9' in
> position 0: ordinal not in range(128)
>
> This will probably work in 3.x, because there, "str" converts
> to Unicode, and quote_plus can handle Unicode. This is one of
> those legacy bugs left from the pre-Unicode era.
>
> There's a workaround. Call urllib.urlencode with a second
> parameter of 1. This turns on the optional feature of
> accepting tuples in the argument to be encoded, and the
> code goes through a newer code path that works.
>
> Is it worth reporting 2.x bugs any more? Or are we in the
> version suckage period, where version N is abandonware and
> version N+1 isn't deployable yet?

You may report 2.7 bugs, but please verify that the behavior is a bug in 
2.7. However, bugs that have been fixed by the switch to unicode for 
text are unlikely to be fixed a second time in 2.7. You might suggest an 
enhancement to the documentation for urlencode if that workaround is not 
clear. Or perhaps that workaround suggests that in this case, a fix 
would not be too difficult, and you can supply a patch.
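
For comparison, here is a minimal sketch of the 3.x behavior, where 
urlencode lives in urllib.parse and encodes text as UTF-8 by default. 
The parameter names and values are made up for illustration, and the 
choice of UTF-8 for the explicit-bytes variant is an assumption about 
what the receiving server expects. The same encode-first idea should 
also avoid the implicit str() conversion in 2.x, since byte strings 
pass through it unchanged.

```python
from urllib.parse import urlencode

# Hypothetical query parameters; u'\xa9' is the copyright sign from the traceback.
params = {"symbol": "\xa9", "q": "a b"}

# 3.x: text values are percent-encoded as UTF-8 by default.
default_encoded = urlencode(params)

# Encoding values to bytes explicitly first (UTF-8 is an assumption here);
# urlencode quotes bytes values as they are.
bytes_encoded = urlencode({k: v.encode("utf-8") for k, v in params.items()})

# doseq=True lets a value be a sequence, producing one key=value pair per element.
multi = urlencode({"q": ["\xa9", "a b"]}, doseq=True)

print(default_encoded)
print(bytes_encoded)
print(multi)
```

Encoding explicitly makes the choice of character encoding visible at 
the call site rather than leaving it to a default.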

The basic deployment problem is that people who want to use unicode text 
also want to use libraries that have not been ported to use unicode 
text. That is the major issue for many porting projects.

-- 
Terry Jan Reedy



