Encoding problems

"Martin v. Löwis" martin at v.loewis.de
Thu Sep 2 16:43:55 EDT 2004


Gandalf wrote:
> This is a fault of the win32 console - it defaults to a different 
> encoding than other parts of the Windows system.
> This is messy but we cannot do anything about it. :-(

It's better than you think. Python, starting with 2.3, will do the
right thing for

# -*- coding: cp1252 -*-
print u"néz"

It determines that this is a Windows console, determines its encoding,
and converts the Unicode string to that encoding. Of course, this
requires the string to be a Unicode literal. So you'd expect that

bildschirm = raw_input(u"néz")

works, but unfortunately, it doesn't, as raw_input does not support
Unicode. However, the encoding Python has determined is available
as sys.stdout.encoding, so you can do

import sys
bildschirm = raw_input(u"néz".encode(sys.stdout.encoding))

This works even if the user has run chcp in the window, because Python
queries the console for its encoding during startup.
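
If you need this in more than one place, the encoding step can be
wrapped in a small helper. This is only a sketch: the name
encode_for_console, the ASCII fallback and the "replace" error handling
are my own assumptions for illustration, not something Python provides.
The fallback is there because sys.stdout.encoding can be None, for
example when output is redirected to a file or pipe.

```python
import sys


def encode_for_console(text, encoding=None, fallback="ascii"):
    """Encode a Unicode string for the console (hypothetical helper)."""
    if encoding is None:
        # sys.stdout.encoding may be None (e.g. redirected output);
        # falling back to ASCII is an assumption made for this sketch.
        encoding = getattr(sys.stdout, "encoding", None) or fallback
    # "replace" substitutes '?' for characters the target encoding
    # lacks, instead of raising UnicodeEncodeError.
    return text.encode(encoding, "replace")


# Intended use under Python 2 (not executed here):
#   bildschirm = raw_input(encode_for_console(u"n\u00e9z"))
```

Passing the encoding explicitly makes it easy to see what a given code
page would produce, without having to open a console that uses it.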

HTH,
Martin


