Odd unicode() behavior
Fredrik Lundh
fredrik at pythonware.com
Wed Aug 30 07:02:09 EDT 2006
maport at googlemail.com wrote:
> The behavior of the unicode built-in function when given a unicode
> string seems a little odd to me:
>
> >>> unicode(u"abc")
> u'abc'
>
> >>> unicode(u"abc", "ascii")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> TypeError: decoding Unicode is not supported
>
> I don't see why providing the encoding should make the function behave
> differently when given a Unicode string. Surely unicode(s) ought to
> behave exactly the same as unicode(s, sys.getdefaultencoding())?
nope.
if you omit the encoding argument, unicode() behaves pretty much like str(),
using either the __unicode__ method or __str__/__repr__ + decoding to get
a Unicode string.
see the language reference for details, e.g.:
http://pyref.infogami.com/unicode
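
a minimal sketch of the two modes described above, not taken from the reference. the text_type alias, the Label class and the try_convert helper are assumptions made for this illustration only; the alias lets the same code run on Python 3, where the unicode type is spelled str and rejects the two-argument call on text with the same kind of TypeError.

```python
import sys

# Python 2's unicode type is spelled str on Python 3; this alias is an
# assumption made so the sketch runs on either version.
text_type = str if sys.version_info[0] >= 3 else unicode  # noqa: F821


class Label(object):
    """Hypothetical class used only for this illustration."""

    def __str__(self):
        return "label"


def try_convert(*args):
    """Return the converted text, or the name of the exception raised."""
    try:
        return text_type(*args)
    except TypeError as exc:
        return type(exc).__name__


# One argument: conversion mode, works on any object, much like str().
one_arg_text = try_convert(u"abc")
one_arg_object = try_convert(Label())

# Two arguments: decoding mode, only accepts encoded byte strings.
two_arg_bytes = try_convert(b"abc", "ascii")
two_arg_text = try_convert(u"abc", "ascii")
two_arg_object = try_convert(Label(), "ascii")
```

with one argument both the text string and the Label instance convert; with an explicit encoding only the byte string is accepted, and the other two calls end in TypeError. so the encoding argument selects a different operation (decoding) rather than tweaking the default one.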
</F>