Long way around UnicodeDecodeError, or 'ascii' codec can't decode byte

Paul Boddie paul at boddie.org.uk
Thu Mar 29 08:53:07 EDT 2007


On 29 Mar, 06:26, "Oleg  Parashchenko" <ole... at gmail.com> wrote:
> Hello,
>
> I'm working on a Unicode-aware application. I like to use "print" to
> debug programs, but in this case it was a nightmare. The most common
> result of "print" was:
>
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xXX in position
> 0: ordinal not in range(128)

What does sys.stdout.encoding say?
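
A minimal sketch of that check follows. The getattr fallback is an
assumption on my part, added to cover the case where stdout has been
replaced by an object that has no encoding attribute; the value may
also be None when output is piped rather than sent to a terminal.

```python
import sys

# The encoding Python believes stdout uses. This may be None (for
# example when output is piped), or missing entirely if stdout has
# been replaced by some other file-like object.
stdout_encoding = getattr(sys.stdout, "encoding", None)
print(repr(stdout_encoding))
```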

> I spent two hours fixing it, and I hope it's done. The solution is one
> of the ugliest hacks I have ever written, but it solves the pain. The
> full story and the code are in my blog:
>
> http://uucode.com/blog/2007/03/23/shut-up-you-dummy-7-bit-python/

Calling sys.setdefaultencoding might not even help in this case, and
the consensus is that it may be harmful to your code's portability
[1]. Writing output to a terminal may be influenced by your locale,
but I'm not convinced that going through all the locale settings and
setting the character set is the best approach (or even the right
one).

What do you get if you do this...?

import locale
locale.setlocale(locale.LC_ALL, "")
print locale.getlocale()

What is your terminal encoding?

Usually, when I want to print Unicode objects, I explicitly encode
them into an encoding I know the terminal supports. The codecs
module can also help with writing Unicode to streams in different
encodings.
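
Both approaches can be sketched roughly as follows. The sample text,
the choice of UTF-8 and the in-memory byte stream are assumptions made
purely for illustration; in practice the stream would be sys.stdout or
a file, and the encoding whatever your terminal actually supports.

```python
import codecs
import io

text = u"caf\xe9"  # a Unicode string with one non-ASCII character

# Approach 1: encode explicitly before writing the bytes out.
encoded = text.encode("utf-8")

# If the target only handles ASCII, an error handler avoids the
# exception by substituting "?" for unencodable characters.
fallback = text.encode("ascii", "replace")

# Approach 2: wrap a byte stream in a codecs writer, so that Unicode
# written to the wrapper is encoded on the way through.
stream = io.BytesIO()  # stands in for a real byte-oriented stream
writer = codecs.getwriter("utf-8")(stream)
writer.write(text)
written = stream.getvalue()
```

The wrapper is handy when there are many print or write calls, since
the encoding is chosen once, where the stream is set up, instead of at
every call site.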

Paul

[1] http://groups.google.com/group/comp.lang.python/msg/431017a4cb4bb8ea



