unicode bit me

Piet van Oostrum piet at cs.uu.nl
Sat May 9 15:31:26 EDT 2009


>>>>> "Mark Tolonen" <metolone+gmane at gmail.com> (MT) wrote:

>MT> <anuraguniyal at yahoo.com> wrote in message
>MT> news:994147fb-cdf3-4c55-8dc5-62d769b12cdc at u9g2000pre.googlegroups.com...
>>> Sorry for being unclear again, hmm, I am becoming an expert at it.
>>> 
>>> I pasted that code as continuation of my old code at start
>>> i.e
>>> class A(object):
>>>     def __unicode__(self):
>>>         return u"©au"
>>>
>>>     def __repr__(self):
>>>         return unicode(self).encode("utf-8")
>>>     __str__ = __repr__
>>> 
>>> "Doesn't work" means it throws a unicode error.
>>> My question boils down to:
>>> what is the difference between these two, and why does one throw an
>>> error while the other does not?
>>> print unicode(a)
>>> vs
>>> print unicode([a])

>MT> That is still an incomplete example.  Your results depend on your source
>MT> code's encoding and your system's stdout encoding.  Assuming a=A(),
>MT> unicode(a) returns u'©au', but then is converted to stdout's encoding for
>MT> display.  

You are confusing the issue. The result does not depend on the source
code's encoding (supposing that the encoding declaration in the source is
correct). __repr__ returns unicode(self).encode("utf-8"), so its result is
UTF-8 encoded even when the source file has a different encoding. The value
of the u"©au" literal does not depend on the source encoding either.
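
To illustrate the point, here is a minimal sketch that simulates the same
source line saved in two different encodings. It is only an approximation
of what the compiler does with a correct encoding declaration: the helper
name literal_value and the byte strings standing in for the file contents
are assumptions made for the example, not part of the original code.

```python
# -*- coding: utf-8 -*-
# Sketch only: simulates one source literal stored in two file encodings.
# The helper name literal_value is hypothetical.

TEXT = u"\u00a9au"  # the code points of u"©au", written with an escape


def literal_value(source_bytes, declared_encoding):
    """Decode raw source bytes the way a correct coding declaration would."""
    return source_bytes.decode(declared_encoding)


# The bytes on disk differ between a Latin-1 file and a UTF-8 file ...
latin1_file_bytes = b"\xa9au"
utf8_file_bytes = b"\xc2\xa9au"

# ... but with a correct declaration both give the same unicode string.
from_latin1 = literal_value(latin1_file_bytes, "latin-1")
from_utf8 = literal_value(utf8_file_bytes, "utf-8")

# What __repr__ in the quoted class returns: the UTF-8 bytes in both cases.
repr_from_latin1 = from_latin1.encode("utf-8")
repr_from_utf8 = from_utf8.encode("utf-8")
```

Both decoded values compare equal to TEXT, and both encoded results are the
same UTF-8 byte string, whichever encoding the file was saved in.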
-- 
Piet van Oostrum <piet at cs.uu.nl>
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: piet at vanoostrum.org


