printing a list with non-ascii strings

Arnaud Delobelle arnodel at gmail.com
Thu Jan 20 14:51:48 EST 2011


Helmut Jarausch <jarausch at skynet.be> writes:

> Hi,
>
> I don't understand Python's behaviour when printing a list.
> The following example uses 2 German non-ascii characters.
>
> #!/usr/bin/python
> # _*_ coding: latin1 _*_
> L=["abc","süß","def"]
> print L[1],L
>
> The output of L[1] is correct, while the output of L shows up as
>  ['abc', 's\xfc\xdf', 'def']
>
> How can this be changed?
>
> Thanks for hint,
> Helmut.

That's because when you print a list, the code executed is roughly:

    print "[" + ", ".join(repr(x) for x in L) + "]"

Now try:

    print repr("süß")
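The equivalence can be checked directly. The sketch below is written for Python 3, with bytes objects standing in for Python 2 byte strings (an assumption made only so that it runs on a current interpreter). The names by_hand, builtin and readable are made up for illustration.

```python
# Python 3 sketch: bytes objects stand in for Python 2 byte strings.
L = [b"abc", "süß".encode("latin-1"), b"def"]

# Formatting a list calls repr() on every item, never str().
by_hand = "[" + ", ".join(repr(x) for x in L) + "]"
builtin = str(L)

print(builtin)             # [b'abc', b's\xfc\xdf', b'def']
print(by_hand == builtin)  # True

# Possible workaround: build the line yourself from the decoded items.
readable = "[" + ", ".join(x.decode("latin-1") for x in L) + "]"
print(readable)            # [abc, süß, def]
```

The last line only looks right if the output device can display the decoded characters. In Python 2 the corresponding workaround would be to join the strings themselves rather than their repr().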

I don't think this can be changed in Python 2.X.  I vaguely remember
discussions about this issue for Python 3, but I can't remember the
outcome.  It is different there anyway, as Python 3 strings are not the
same as Python 2 strings (they are the same as Python 2 unicode strings).
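For comparison, here is a small Python 3 sketch. As far as I know, Python 3 keeps printable non-ASCII characters in the repr() of a str (this was the subject of PEP 3138), and the builtin ascii() gives the escaped form on request. The variable names are illustrative only.

```python
# Python 3: str is a Unicode string, and repr() keeps printable
# non-ASCII characters instead of escaping them.
word = "süß"
L = ["abc", word, "def"]

print(repr(word))   # 'süß'
print(str(L))       # ['abc', 'süß', 'def']

# ascii() produces the escaped, ASCII-only form.
print(ascii(word))  # 's\xfc\xdf'
```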

The issue though is that the Python interpreter doesn't know what
encoding is supposed to be used for a string - a string in Python 2.X is
a sequence of bytes. If you print the string, the bytes are sent to the
terminal unchanged and the terminal decodes them according to its own
settings, which has nothing to do with Python - so the appearance will
differ according to the locale configuration of the terminal.  However,
the repr() of a string needs to be consistent irrespective of the
configuration of the terminal - so the only viable option is to use
nothing but ASCII characters.  Hence the difference.
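A rough illustration of that point, again as a Python 3 sketch with a bytes object standing in for the Python 2 string. The choice of cp437 as the "other" terminal encoding is arbitrary, and the names are hypothetical.

```python
# The bytes Python 2 stored for "süß" in a latin1 source file.
raw = b"s\xfc\xdf"

# What a reader sees depends on the encoding used to interpret the bytes.
as_latin1 = raw.decode("latin-1")  # the intended text
as_cp437 = raw.decode("cp437")     # same bytes, different characters

# repr() never consults the terminal, so it is the same everywhere
# and contains ASCII characters only.
fixed = repr(raw)
print(fixed)                             # b's\xfc\xdf'
print(all(ord(c) < 128 for c in fixed))  # True
print(as_latin1 == as_cp437)             # False
```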

HTH

-- 
Arnaud


