Bug in Python 2.6 urlencode

John Nagle nagle at animats.com
Tue Sep 7 23:21:37 EDT 2010


On 9/7/2010 5:43 PM, Terry Reedy wrote:
> On 9/7/2010 3:02 PM, John Nagle wrote:
>> There's a bug in Python 2.6's "urllib.urlencode". If you pass
>> in a Unicode character outside the ASCII range, instead of it
>> being encoded properly, an exception is raised.
>>
>> File "C:\python26\lib\urllib.py", line 1267, in urlencode
>> v = quote_plus(str(v))
>> UnicodeEncodeError: 'ascii' codec can't encode character u'\xa9' in
>> position 0: ordinal not in range(128)
>>
>> This will probably work in 3.x, because there, "str" converts
>> to Unicode, and quote_plus can handle Unicode. This is one of
>> those legacy bugs left from the pre-Unicode era.
>>
>> There's a workaround. Call urllib.urlencode with a second
>> parameter of 1. This turns on the optional feature of
>> accepting tuples in the argument to be encoded, and the
>> code goes through a newer code path that works.
>>
>> Is it worth reporting 2.x bugs any more? Or are we in the
>> version suckage period, where version N is abandonware and
>> version N+1 isn't deployable yet?
>
> You may report 2.7 bugs, but please verify that the behavior is a bug in
> 2.7. However, bugs that have been fixed by the switch to
> unicode for text are unlikely to be fixed a second time in 2.7. You
> might suggest an enhancement to the doc for urlencode if that workaround
> is not clear. Or perhaps that workaround suggests that in this case, a
> fix would not be too difficult, and you can supply a patch.
>
> The basic deployment problem is that people who want to use unicode text
> also want to use libraries that have not been ported to use unicode
> text. That is the major issue for many porting projects.

     In other words, we're in the version suckage period.

					John Nagle
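
For reference, a minimal sketch of another way around the exception
described above: encode text to bytes explicitly before calling
urlencode, so the implicit ASCII str() conversion never sees a
non-ASCII character. The helper names (urlencode_utf8, _to_utf8) are
hypothetical, and UTF-8 is an assumption about what the receiving
server expects. As far as I recall from the 2.x source, the doseq code
path mentioned above encodes Unicode values as ASCII with "replace",
so it may avoid the exception only by turning the character into "?";
that is worth checking before relying on it.

```python
# -*- coding: utf-8 -*-
# Sketch only: urlencode_utf8 is a hypothetical helper, not part of urllib.
try:
    from urllib import urlencode        # Python 2.x location
except ImportError:
    from urllib.parse import urlencode  # Python 3.x location

TEXT_TYPE = type(u"")  # unicode on 2.x, str on 3.x


def _to_utf8(item):
    """Return UTF-8 bytes for text; leave any other object unchanged."""
    if isinstance(item, TEXT_TYPE):
        return item.encode("utf-8")
    return item


def urlencode_utf8(pairs):
    """Encode text keys and values as UTF-8 (an assumed charset), then urlencode."""
    return urlencode([(_to_utf8(k), _to_utf8(v)) for k, v in pairs])


query = urlencode_utf8([(u"symbol", u"\xa9"), (u"q", u"a b")])
print(query)  # symbol=%C2%A9&q=a+b
```

Passing a list of pairs rather than a dict keeps the parameter order
predictable on old interpreters.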




