Trying to understand this moji-bake

Steven D'Aprano steve+comp.lang.python at pearwood.info
Fri Jan 24 23:37:34 EST 2014


I have an unexpected display error when dealing with Unicode strings, and 
I cannot understand where the error is occurring. I suspect it's not 
actually a Python issue, but I thought I'd ask here to start.

Using Python 3.3, if I print a Unicode string from the command line, it 
displays correctly. I'm using the KDE 3.5 Konsole application, with the 
encoding set to the default (which ought to be UTF-8, I believe, although 
I'm not completely sure). This displays correctly:

[steve at ando ~]$ python3.3 -c "print(u'ñøλπйж')"
ñøλπйж
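
Since I'm not certain what encoding is actually in effect, here is a minimal sketch for asking Python what it believes. The helper name report_encodings is made up for illustration; the calls themselves are standard library, and the values will differ from machine to machine.

```python
import locale
import sys


def report_encodings():
    """Collect the encodings Python reports for this process."""
    return {
        # Encoding used when printing text to standard output.
        # May be None on Python 2 when output is piped.
        'stdout': sys.stdout.encoding,
        # Encoding the locale settings suggest for text data.
        'preferred': locale.getpreferredencoding(),
        # Encoding used for implicit str/bytes conversions.
        'default': sys.getdefaultencoding(),
    }


info = report_encodings()
for name in sorted(info):
    print(name, '=', info[name])
```

Running the same file under each interpreter (python2.7, python3.2, python3.3) should show whether they disagree about the terminal.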


Likewise for Python 3.2:

[steve at ando ~]$ python3.2 -c "print('ñøλπйж')"
ñøλπйж


But using Python 2.7, I get a really bad case of moji-bake:

[steve at ando ~]$ python2.7 -c "print u'ñøλπйж'"
Ã±Ã¸Î»ÏÐ¹Ð¶


However, interactively it works fine:

[steve at ando ~]$ python2.7 -E
Python 2.7.2 (default, May 18 2012, 18:25:10)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-52)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> print u'ñøλπйж'
ñøλπйж


This occurs on at least two different machines, one running CentOS and the 
other Debian.

Anyone have any idea what's going on? I can replicate the display error 
using Python 3 like this:

py> s = 'ñøλπйж'
py> print(s.encode('utf-8').decode('latin-1'))
Ã±Ã¸Î»ÏÐ¹Ð¶

but I'm not sure why it happens with python2.7 -c at the command line and 
not in the interactive interpreter.
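
To make the replication concrete, here is a small Python 3 sketch of the round trip. The function names garble and ungarble are made up for illustration. It only shows that the bad display matches UTF-8 bytes being read as Latin-1; it does not show where in the chain that misreading happens.

```python
def garble(text):
    """Encode as UTF-8, then misread those bytes as Latin-1."""
    return text.encode('utf-8').decode('latin-1')


def ungarble(text):
    """Reverse of garble(): recover the bytes, then decode them as UTF-8."""
    return text.encode('latin-1').decode('utf-8')


s = 'ñøλπйж'
bad = garble(s)

# Each of the 6 characters takes 2 bytes in UTF-8, and Latin-1 maps
# every byte to exactly one character, so the garbled string has 12.
print(len(s), len(bad))

# ascii() shows the individual code points without depending on the
# terminal's encoding.
print(ascii(bad))

# The damage is reversible as long as no bytes were lost on the way.
print(ungarble(bad) == s)
```

The \x80 that follows Ï in the garbled text is a control character, which is why nothing visible appears in that position on screen.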



-- 
Steven


