unicode bit me

Scott David Daniels Scott.Daniels at Acm.Org
Sun May 10 02:19:21 EDT 2009


anuraguniyal at yahoo.com wrote:
> class A(object):
>     def __unicode__(self):
>         return u"©au"
>     def __repr__(self):
>         return unicode(self).encode("utf-8")
>     __str__ = __repr__
> a = A()
> u1 = unicode(a)
> u2 = unicode([a])
> 
> now I am not using print so that doesn't matter stdout can print
> unicode or not
> my naive question is line u2 = unicode([a]) throws
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position
> 1: ordinal not in range(128)
> 
> shouldn't list class call unicode on its elements? 
> I was expecting that so instead do i had to do this
> u3 = "["+u",".join(map(unicode,[a]))+"]"

Why would you expect that?  str([a]) doesn't call str on its elements;
it calls repr on them, and unicode([a]) does the same.
A simple experiment shows this:
     class B(object):
         def __unicode__(self):
             return u'unicode'
         def __repr__(self):
             return 'repr'
         def __str__(self):
             return 'str'
     >>> unicode(B())
     u'unicode'
     >>> unicode([B()])
     u'[repr]'
     >>> str(B())
     'str'
     >>> str([B()])
     '[repr]'
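
This also accounts for the UnicodeDecodeError in the original post:
unicode([a]) first builds the list's repr as a byte string, which
contains the UTF-8 bytes returned by A.__repr__, and then decodes that
byte string with the default ascii codec. A minimal sketch of that last
step follows. It is an illustration only, written to run on both
Python 2 and Python 3, and the variable names are made up for the example:

```python
# -*- coding: utf-8 -*-
# The bytes that repr([a]) would produce in the original example:
# '[' + UTF-8 encoding of the copyright sign and "au" + ']'
list_repr_bytes = b"[" + u"\u00a9au".encode("utf-8") + b"]"

try:
    # Roughly what the implicit str -> unicode conversion attempts.
    list_repr_bytes.decode("ascii")
    error = None
except UnicodeDecodeError as exc:
    error = exc

# The failing byte is 0xc2 at position 1, right after the '['.
print(error.start, hex(ord(list_repr_bytes[1:2])))

# Decoding with the codec that was used for encoding succeeds.
decoded = list_repr_bytes.decode("utf-8")
```

So the problem is not in the list code itself, but in a __repr__ that
returns non-ASCII bytes which later get decoded as ASCII.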

Now if you ask _why_ a list calls repr on its elements,
the answer is, "so that the following is not deceptive":

     >>> repr(["a, b", "c"])
     "['a, b', 'c']"
which does not look like a 3-element list.  If str were applied to the
elements instead, the result would be "[a, b, c]", which does.
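
If you do want each element converted with str or unicode, the join has
to be spelled out by hand, as in the u3 line quoted above. A minimal
sketch, assuming a hypothetical helper named format_list (not part of
Python); on Python 2 one would pass convert=unicode to get the behaviour
the original poster expected:

```python
def format_list(items, convert=str):
    """Format items like a list display, applying convert to each element.

    format_list is a hypothetical helper for illustration only.
    """
    return "[" + ", ".join(convert(item) for item in items) + "]"


class B(object):
    def __repr__(self):
        return 'repr'

    def __str__(self):
        return 'str'


print(format_list([B()]))           # [str]   (explicit str on each element)
print(str([B()]))                   # [repr]  (built-in list behaviour)
print(format_list(["a, b", "c"]))   # [a, b, c]  (the deceptive form)
```

The last line shows the price of the explicit join: the output no longer
distinguishes a two-element list from a three-element one.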

--Scott David Daniels
Scott.Daniels at Acm.Org


