print u"\u0432": why is this so hard? UnicodeEncodeError

"Martin v. Löwis" martin at v.loewis.de
Thu Apr 8 00:54:33 EDT 2004


Nelson Minar wrote:
> I have a simple goal. I want the following Python program to work:
>   print u"\u0432"

As you have discovered, this is not so simple. Printing this character
might not be possible at all: If you have a terminal that just cannot
display CYRILLIC SMALL LETTER VE, then there is absolutely no way to
print the character - unless you change the terminal you use.

> Actually, I have a complex goal: I want my SOAPpy program to work when
> SOAPpy is in debug mode and is printing XML messages out to stdout.
> Solving the simple problem will solve the complex one. Since I'm using
> third party code, I can't go modify every print statement to call
> encode() explicitly.

This shows the real source of the problem. SOAPpy should not print the
strings, but repr them. For debugging, repr is more reliable than str,
as it can render virtually every object.
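
To illustrate why repr is the safer choice for debug output, here is a
minimal sketch. It assumes Python 3, where the built-in ascii() escapes
non-ASCII characters the way repr() did for unicode objects in the
Python 2 releases discussed in this thread. The helper name debug_render
is hypothetical and is not part of SOAPpy.

```python
def debug_render(value):
    """Return an ASCII-only rendering of value, suitable for debug output.

    Assumption: Python 3. ascii() behaves like repr() but escapes every
    non-ASCII character, so the result can be written to any stream.
    """
    return ascii(value)


text = "\u0432"  # CYRILLIC SMALL LETTER VE
rendered = debug_render(text)
print(rendered)  # contains only ASCII characters, so no encoding error
```

Because the rendered string is pure ASCII, printing it does not depend
on what the terminal or the redirected file can represent.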

> That seems to work reasonably well in Python 2.3 (but not 2.2!). But
> then for some obscure reason if I redirect stdout in my shell it fails.
>   $ LANG=en_US.UTF-8 python2.3 -c 'print u"\u0432"' > /dev/null
> 
> Why is that?

Python 2.3 discovers the encoding of your terminal, and will display
Unicode characters if the terminal supports them. Python 2.2 did not do
that, and the new feature is mainly useful in interactive mode.

When you redirect the output to a file, it is not a terminal anymore,
and Python cannot guess the encoding.
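
A rough way to see the difference for yourself is to inspect the stream
at run time. The sketch below uses only the standard isatty() method and
the encoding attribute of file objects; the function name
describe_stream is hypothetical. Note that the reported values depend on
the Python version: in 2.3 a redirected stdout has no encoding, while
later versions may fill one in from the locale.

```python
import sys


def describe_stream(stream):
    """Return (is_terminal, encoding) for a file-like object."""
    # Not every file-like object implements isatty(); treat those as non-terminals.
    is_tty = bool(hasattr(stream, "isatty") and stream.isatty())
    # The encoding attribute may be missing or None when nothing is known.
    encoding = getattr(stream, "encoding", None)
    return is_tty, encoding


is_tty, encoding = describe_stream(sys.stdout)
sys.stderr.write("terminal: %s, encoding: %r\n" % (is_tty, encoding))
```

Running the script once normally and once with "> /dev/null" shows what
Python knows about stdout in each case. The report goes to stderr so it
stays visible when stdout is redirected.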

> The only solution I've found that really works is reassigning
> sys.stdout at the top of the script. That's an awful lot of work, but
> it's the best I can do for now.
> 
> Why is Python not respecting my locale?

It is: however, your locale only tells Python the encoding of your
terminal, not the encoding of an arbitrary file you may write to.

Assigning sys.stdout is the right thing to do. I'm uncertain why
that could be an awful lot of work, as you do this only once...
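
For reference, a minimal sketch of the one-time wrapping, using
codecs.getwriter from the standard library. The choice of UTF-8 is an
assumption based on the LANG=en_US.UTF-8 setting quoted above, and the
helper name utf8_writer is hypothetical. To keep the example
self-contained, it wraps an in-memory byte stream that stands in for a
redirected stdout.

```python
import codecs
import io


def utf8_writer(byte_stream):
    """Wrap a byte stream so that text written to it is encoded as UTF-8."""
    return codecs.getwriter("utf-8")(byte_stream)


# An in-memory byte stream stands in for a redirected stdout here.
buffer = io.BytesIO()
out = utf8_writer(buffer)
out.write(u"\u0432\n")  # CYRILLIC SMALL LETTER VE, then a newline
encoded = buffer.getvalue()  # the UTF-8 bytes that reached the stream
```

In a real script the same wrapper would be applied once near the top,
for example sys.stdout = utf8_writer(sys.stdout) in the Python 2
releases discussed here, or utf8_writer(sys.stdout.buffer) in Python 3,
after which third-party print statements need no changes.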

Regards,
Martin



