Bug in Python 2.6 urlencode

John Nagle nagle at animats.com
Tue Sep 7 15:02:07 EDT 2010


     There's a bug in Python 2.6's "urllib.urlencode".  If you pass
in a Unicode string containing a character outside the ASCII range,
an exception is raised instead of the value being encoded properly.

   File "C:\python26\lib\urllib.py", line 1267, in urlencode
     v = quote_plus(str(v))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xa9' in position 0: ordinal not in range(128)
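
     The failing step is the implicit ASCII encoding that str()
performs on a unicode object in 2.x.  A minimal sketch of the same
failure, written as an explicit encode so that it behaves the same
way on 2.x and 3.x (the variable names are only for illustration):

```python
# Illustration only: in 2.x, str(v) on a unicode object implicitly
# encodes with the ASCII codec, which is the step that fails inside
# urlencode.  The explicit encode below fails the same way.
value = u"\xa9"  # COPYRIGHT SIGN, outside the ASCII range

try:
    value.encode("ascii")
    failed = False
except UnicodeEncodeError as exc:
    failed = True
    print(exc)
```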

     This will probably work in 3.x, because there, "str" produces
a Unicode string, and quote_plus can handle Unicode.  This is one of
those legacy bugs left from the pre-Unicode era.

     There's a workaround.  Call urllib.urlencode with a second
parameter ("doseq") of 1.  This turns on the optional feature of
accepting sequences as values in the argument to be encoded, and
the code goes through a newer code path that doesn't raise the
exception.
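
     Another way around it, sketched here on the assumption that
the receiving server expects UTF-8, is to encode keys and values to
bytes yourself before calling urlencode, so that str() never sees a
unicode object.  The helper name urlencode_utf8 is made up for this
example and is not part of urllib.  Note also that, from a reading of
the 2.x source, the doseq path appears to encode with ASCII and
"replace", so it avoids the exception but turns non-ASCII characters
into "?".

```python
try:
    from urllib import urlencode          # Python 2
except ImportError:
    from urllib.parse import urlencode    # Python 3

text_type = type(u"")  # unicode on 2.x, str on 3.x


def urlencode_utf8(params):
    """Hypothetical helper: UTF-8 encode text keys/values, then urlencode."""
    # Accept either a dict or a sequence of (key, value) pairs.
    pairs = params.items() if hasattr(params, "items") else params
    encoded = []
    for key, value in pairs:
        if isinstance(key, text_type):
            key = key.encode("utf-8")
        if isinstance(value, text_type):
            value = value.encode("utf-8")
        encoded.append((key, value))
    return urlencode(encoded)


query = urlencode_utf8([("q", u"\xa9")])
print(query)  # q=%C2%A9
```

     The choice of UTF-8 is an assumption; if the server expects a
different encoding, the two encode() calls would need to change to
match.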

     Is it worth reporting 2.x bugs any more?  Or are we in the
version suckage period, where version N is abandonware and
version N+1 isn't deployable yet?

   					John Nagle


