"Decoding unicode is not supported" in unusual situation

Wed Mar 7 16:48:58 EST 2012

John Nagle <nagle at animats.com> writes:

>    The library bug, if any, is that you can't apply
>
> 	unicode(s, errors='replace')
>
> to a Unicode string. TypeError("Decoding unicode is not supported") is
> raised.  However
>
>   	unicode(s)
>
> will accept Unicode input.

I think that's a Python bug. If the latter succeeds as a no-op, the
former should also succeed as a no-op. Neither should ever raise an
error when ‘s’ is already a ‘unicode’ object.
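
For anyone who wants to see the asymmetry for themselves, here is a
minimal reproduction sketch. The name ‘text_type’ is my own, introduced
only so the snippet runs on either major version; on Python 3, as far as
I know, ‘str’ shows the analogous behaviour with a slightly different
message.

```python
import sys

# Python 2 has a separate ``unicode`` type; Python 3's text type is ``str``.
# ``text_type`` is an illustrative alias, not part of any library.
text_type = unicode if sys.version_info[0] == 2 else str

s = text_type("already text")

# The one-argument form accepts text input and returns an equal string.
plain = text_type(s)

# The form with ``errors`` is rejected, even though nothing needs decoding.
message = None
try:
    text_type(s, errors='replace')
except TypeError as exc:
    message = str(exc)

print(message)
```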

> The Python documentation
> ("http://docs.python.org/library/functions.html#unicode") does not
> mention this. It is therefore necessary to check the type before
> calling "unicode", or catch the undocumented TypeError exception
> afterward.

Yes, this check should not be necessary; calling the ‘unicode’
constructor with an object that's already an instance of ‘unicode’
should just return the object as-is, IMO. It shouldn't matter that
you've specified how decoding errors are to be handled, because in that
case no decoding happens anyway.
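
Until that changes, the type check John describes can be wrapped up
once. What follows is only a sketch: ‘to_text’ and ‘text_type’ are
hypothetical names of mine, the UTF-8 default is an assumption (the
original call relied on the interpreter's default encoding), and the
helper is meant only for byte strings and text.

```python
import sys

# Illustrative alias: ``unicode`` on Python 2, ``str`` on Python 3.
text_type = unicode if sys.version_info[0] == 2 else str


def to_text(s, encoding='utf-8', errors='replace'):
    """Return ``s`` as text, decoding only when it is not text already.

    Hypothetical helper; the ``encoding`` default is an assumption.
    """
    if isinstance(s, text_type):
        # Already text: hand it back unchanged, which is what the
        # constructor arguably ought to do by itself.
        return s
    # Byte input: decode, applying the requested error handling.
    return text_type(s, encoding, errors)
```

With this, ‘to_text(s)’ can be called without caring whether ‘s’ has
already been decoded, and the ‘errors’ argument only comes into play
when there really are bytes to decode.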

Care to report that bug to <URL:http://bugs.python.org/>, John?

-- 
 \          “Those who write software only for pay should go hurt some |
  `\                 other field.” —Erik Naggum, in _gnu.misc.discuss_ |
_o__)                                                                  |
Ben Finney