WTF? Printing unicode strings

Serge Orlov Serge.Orlov at gmail.com
Fri May 19 07:28:49 EDT 2006


Serge Orlov wrote:
> Ron Garret wrote:
> > In article <1148001708.183506.296240 at g10g2000cwb.googlegroups.com>,
> >  "Serge Orlov" <Serge.Orlov at gmail.com> wrote:
> >
> > > Ron Garret wrote:
> > > > > > I'm using an OS X terminal to ssh to a Linux machine.
> > > > >
> > > > > > In theory it should work out of the box. OS X terminal should set
> > > > > > the environment variable LANG=en_US.utf-8, then ssh should transfer
> > > > > > this variable to Linux and python will know that your terminal is
> > > > > > utf-8. Unfortunately AFAIK OS X terminal doesn't set that variable
> > > > > > and most (all?) ssh clients don't transfer it between machines. As a
> > > > > > workaround you can set that variable on Linux yourself. This should
> > > > > > work on the command line right away:
> > > > >
> > > > > LANG=en_US.utf-8 python -c "print unichr(0xbd)"
> > > > >
> > > > > Or put the following line in ~/.bashrc and logout/login
> > > > >
> > > > > export LANG=en_US.utf-8
> > > >
> > > > No joy.
> > > >
> > > > ron at www01:~$ LANG=en_US.utf-8 python -c "print unichr(0xbd)"
> > > > Traceback (most recent call last):
> > > >   File "<string>", line 1, in ?
> > > > UnicodeEncodeError: 'ascii' codec can't encode character u'\xbd' in
> > > > position 0: ordinal not in range(128)
> > > > ron at www01:~$
> > >
> > > > What version of python and what shell do you run? What do the
> > > > following commands print:
> > >
> > > python -V
> > > echo $SHELL
> > > $SHELL --version
> >
> > ron at www01:~$ python -V
> > Python 2.3.4
> > ron at www01:~$ echo $SHELL
> > /bin/bash
> > ron at www01:~$ $SHELL --version
> > GNU bash, version 2.05b.0(1)-release (i386-pc-linux-gnu)
> > Copyright (C) 2002 Free Software Foundation, Inc.
> > ron at www01:~$
>
> That's recent enough. I guess the distribution you're using sets LC_*
> variables for no good reason. Either unset all environment variables
> starting with LC_ and set the LANG variable, or override the LC_CTYPE
> variable:
>
> LC_CTYPE=en_US.utf-8 python -c "print unichr(0xbd)"
>
> Should be working now :)
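
For anyone following along, here is a minimal sketch (written for
Python 3, not the 2.3 used in this thread) of why overriding LC_CTYPE
can work when setting LANG alone does not. POSIX locale lookup checks
LC_ALL first, then the category variable (LC_CTYPE here), then LANG,
and an empty value counts as unset. The helper name effective_ctype is
made up for illustration, and the "C" fallback is the usual default,
not something the thread states.

```python
def effective_ctype(env):
    """Return (variable, value) that decides the character set.

    Follows the POSIX lookup order: LC_ALL, then LC_CTYPE, then LANG.
    `env` is any mapping, for example os.environ.
    """
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:  # unset or empty: fall through to the next variable
            return name, value
    # Assumption: with nothing set, the implementation default is "C".
    return None, "C"


if __name__ == "__main__":
    import os

    name, value = effective_ctype(os.environ)
    print("character set is decided by", name, "=", value)
```

If this reports LC_ALL as the deciding variable, overriding LC_CTYPE
will have no effect either, and LC_ALL itself has to be unset or
changed.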

I've pulled myself together and installed Linux in VMware Player.
Apparently there is another way Linux distributors can screw up. I
chose the Debian 3.1 minimal network install, and after answering all
the installation questions I found that only ASCII and Latin-1 English
locales were installed:
$ locale -a
C
en_US
en_US.iso88591
POSIX

In 2006, I would expect a UTF-8 English locale to be present even in a
minimal install. I had to edit /etc/locale.gen and run locale-gen as
root. After that Python started to print Unicode characters.
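
A quick way to check for this problem is to scan the output of
`locale -a` for names whose codeset is UTF-8. The following is only a
sketch, written for Python 3 rather than the 2.3 from the thread; the
function name utf8_locales is made up for illustration, and the sample
text is the listing shown above.

```python
def utf8_locales(locale_a_output):
    """Return the locale names in `locale -a` output that use UTF-8.

    Names look like language_TERRITORY.codeset@modifier. The codeset may
    be spelled "utf8", "UTF-8", "utf-8" and so on, so compare loosely.
    """
    found = []
    for line in locale_a_output.splitlines():
        name = line.strip()
        # Text between the first "." and an optional "@modifier".
        codeset = name.partition(".")[2].partition("@")[0]
        if codeset.lower().replace("-", "") == "utf8":
            found.append(name)
    return found


# The listing from the fresh Debian 3.1 install described above.
sample = "C\nen_US\nen_US.iso88591\nPOSIX\n"

if __name__ == "__main__":
    if utf8_locales(sample):
        print("UTF-8 locales:", utf8_locales(sample))
    else:
        print("no UTF-8 locale installed")
```

With that listing the result is empty, which matches the symptom:
setting LANG or LC_CTYPE to a UTF-8 locale cannot help until such a
locale has actually been generated.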



