Thanks for all responses

Wolfgang Meiners WolfgangMeiners01 at web.de
Tue May 31 15:52:24 EDT 2011


I think they helped me very much to understand the problem.

So if I deal with non-ASCII strings, I have a 'list of bytes' and need an
encoding to interpret this list and transform it into a meaningful unicode
string; this step is decoding. Encoding does the opposite.
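
A minimal sketch of that round trip, using Python's own naming (`decode` goes from bytes to text, `encode` goes from text back to bytes). The variable names are only illustrative, and the non-ASCII characters are written as escapes so the snippet does not depend on the source file's encoding:

```python
# Sketch only; runs on Python 2.6+ and Python 3.3+.
raw = b'M\xc3\xbcller'           # 7 bytes: "Mueller" with u-umlaut, stored as UTF-8

text = raw.decode('utf-8')       # bytes -> unicode text
back = text.encode('utf-8')      # unicode text -> bytes

# The same bytes read with the wrong codec still decode, but to the wrong text.
wrong = raw.decode('latin-1')

print(len(raw), len(text), back == raw, wrong == text)
```

The byte string is one element longer than the text because the u-umlaut takes two bytes in UTF-8, which is why the 'list of bytes' alone is not enough without knowing the encoding.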

Whenever I 'cross the border' of my program, I have to decode the 'list
of bytes' to a unicode string on the way in, or encode the unicode string
to a 'list of bytes' which is meaningful to the world outside on the way out.

So 'decode early, encode late' means doing it as near to the border as
possible, and to decode/encode I need a coding system, for example 'utf8'.

That means there should be a way to decode and encode at every
interface I can use: files, stdin, stdout, stderr, GUI (these should be
the most important ones).
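
For the file interface, one way this might look is `io.open`, which takes an `encoding` argument and converts between bytes and text at the file boundary. This is a sketch under assumptions: `save_name`, `load_name` and the temporary file are hypothetical names made up for the example.

```python
# Sketch only; runs on Python 2.6+ and Python 3.3+.
import io
import os
import tempfile


def save_name(path, name):
    """Write unicode text; io.open encodes it to UTF-8 bytes on the way out."""
    with io.open(path, 'w', encoding='utf-8') as f:
        f.write(name)


def load_name(path):
    """Read the file; io.open decodes the UTF-8 bytes to text on the way in."""
    with io.open(path, 'r', encoding='utf-8') as f:
        return f.read()


path = os.path.join(tempfile.mkdtemp(), 'name.txt')  # hypothetical scratch file
save_name(path, u'M\xfcller')
loaded = load_name(path)

# Reading in binary mode shows what actually sits on disk.
with open(path, 'rb') as f:
    on_disk = f.read()
```

Inside the program only text is handled; the bytes exist only on the far side of `io.open`. For stdin, stdout and stderr the standard library has similar wrappers (for example `codecs.getwriter`), which are not shown here.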

While trying to understand this, I wrote the following program. Maybe
someone can give me a hint on how to print correctly:

######################################################
#! python
# -*- coding: utf-8 -*-

class EncTest:
    def __init__(self,Name=None):
        self.Name=unicode(Name, encoding='utf8')

    def __repr__(self):
        return u'My name is %s' % self.Name

if __name__ == '__main__':

    a = EncTest('Müller')

    # this does work
    print a.__repr__()

    # throws an error if default encoding is ascii
    # but works if default encoding is utf8
    print a

    # throws an error because a is not a string
    print unicode(a, encoding='utf8')
######################################################
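
One possible direction, offered as a sketch rather than a definitive fix: keep the text form in `__unicode__` (the hook that `unicode(obj)` calls on Python 2), let `__str__` return encoded bytes on Python 2, and keep `__repr__` ASCII-safe. The version switch and the choice of UTF-8 for `__str__` are assumptions made for this example; the class only mirrors the `EncTest` above.

```python
# Sketch only; runs on Python 2.6+ and Python 3.3+.
import sys


class EncTest(object):
    def __init__(self, name):
        # Decode at the border: accept UTF-8 bytes, store text internally.
        if isinstance(name, bytes):
            name = name.decode('utf-8')
        self.name = name

    def __unicode__(self):
        # Text representation; called by unicode(obj) on Python 2.
        return u'My name is %s' % self.name

    if sys.version_info[0] >= 3:
        # Python 3: str is already text, so reuse the same method.
        __str__ = __unicode__
    else:
        def __str__(self):
            # Python 2: str must be bytes, so encode explicitly (assumed UTF-8).
            return self.__unicode__().encode('utf-8')

    def __repr__(self):
        # %r escapes non-ASCII characters on Python 2, so this stays printable.
        return 'EncTest(%r)' % (self.name,)


a = EncTest(b'M\xc3\xbcller')
message = a.__unicode__()
```

With this layout, `unicode(a)` on Python 2 goes through `__unicode__` instead of trying to decode the instance, and `print a` goes through `__str__`, so neither depends on the default encoding being changed.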

Wolfgang



