Problem with __str__ method and character encoding

Chris Angelico rosuav at gmail.com
Fri Dec 7 09:29:06 EST 2012


On Sat, Dec 8, 2012 at 1:14 AM, gialloporpora <gialloporpora at gmail.com> wrote:
> Dear all,
> I have a problem with character encoding.
> I have created my class and I have redefined the __str__ method for pretty
> printing.  I have saved my file as test.py,
> I give these lines:
>
>>>> from test import *
>>>> a = msgmarker("why", u"perché", 0)
>>>> print a
> UnicodeError
>>>> print a.__str__()
> OK

Your __str__ method is not returning a string. It's returning a
Unicode object. Under Python 2 (which you're obviously using, since
you use print as a statement), strings are bytes. When print calls
str() on your object, the Unicode result is implicitly encoded with
the default ASCII codec, and that fails on the "é". Calling
a.__str__() yourself hands print the Unicode object directly, which
it encodes using your terminal's encoding, so that one works. The
best thing to do would be to move to Python 3.3, in which the default
string type is Unicode, and there's a separate 'bytes' type for
communicating with file systems and networks and such. But if that's
not possible, I would recommend having a separate method for
returning a Unicode string (the same as your current __str__ but with
a different name), and have __str__ call that and encode it -
something like this:

def describe(self):
    return u'msgid: "%s"\nmsgstr: "%s"' % (self.msgid, self.msgstr)

def __str__(self):
    # self._encoding: whatever encoding your output expects, e.g. 'utf-8'
    return self.describe().encode(self._encoding)
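
For reference, the failure behind the UnicodeError can be reproduced by doing the ASCII encoding step explicitly. This is only a small sketch: it is written so that it also runs on Python 3.3+, where the encode has to be spelled out, and the sample text and the names text, ascii_ok and encoded are made up for illustration.

```python
# -*- coding: utf-8 -*-
# Illustrative text only; any string with a non-ASCII character behaves the same.
text = u'msgstr: "perché"'

# Python 2 implicitly encodes a Unicode result from __str__ with ASCII.
# Doing that step by hand shows why it blows up on the accented letter.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Encoding explicitly with a codec that covers the character succeeds.
encoded = text.encode("utf-8")
```

That explicit encode is the step the __str__ above performs, with self._encoding standing in for "utf-8".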

But it'd definitely be better to move to Python 3.
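
As a rough idea of what that might look like under Python 3, here is a sketch. I don't know what your class really looks like, so the class body, the name of the third constructor argument, and the choice of UTF-8 in __bytes__ are all assumptions.

```python
class MsgMarker:
    """Hypothetical stand-in for the msgmarker class from the question."""

    def __init__(self, msgid, msgstr, position=0):
        self.msgid = msgid
        self.msgstr = msgstr
        self.position = position  # assumed meaning of the third argument

    def __str__(self):
        # In Python 3, str is Unicode, so __str__ can return the text as-is.
        return 'msgid: "%s"\nmsgstr: "%s"' % (self.msgid, self.msgstr)

    def __bytes__(self):
        # Encode only when bytes are actually needed (files, sockets).
        # UTF-8 is an assumption here; use whatever your output expects.
        return str(self).encode("utf-8")


a = MsgMarker("why", "perché", 0)
text = str(a)    # Unicode text, safe to pass to print()
data = bytes(a)  # encoded form for writing to a binary file or socket
```

With that, print(a) works without any encoding step in __str__, and the encoding decision lives in one place.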

ChrisA


