Encoding confusion, please help

"Martin v. Löwis" martin at v.loewis.de
Sun Nov 14 13:49:31 EST 2004


Pekka Niiranen wrote:
>  >>> sys.getdefaultencoding()
> 'iso-8859-1'

This is already troublesome; it means somebody (perhaps you)
has tampered with your Python installation. The default system
encoding is ascii, and it should not be changed unless
absolutely necessary.

> When should I use locale.getpreferredencoding() and when
> sys.getdefaultencoding()?

There should never be a need to probe sys.getdefaultencoding(),
as it should always be ascii.

locale.getpreferredencoding() should be used when converting
Unicode strings to and from byte strings that are stored on the local
system (e.g. in files). Note that this encoding may or may not also be
the right one for printing data to the terminal. On Windows in
particular, the terminal often uses yet another encoding.
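
As a minimal sketch of the file case, assuming a current Python where
str is the Unicode type: the helper names write_text and read_text are
made up for illustration, and the "replace" error handler is just one
possible choice for characters the locale encoding cannot represent.

```python
import locale


def write_text(path, text, encoding=None):
    """Encode text and write the resulting bytes to path.

    If no encoding is given, fall back to the locale's preferred
    encoding. Returns the encoding that was actually used.
    """
    if encoding is None:
        encoding = locale.getpreferredencoding()
    # "replace" substitutes "?" for characters the codec cannot encode,
    # instead of raising UnicodeEncodeError.
    data = text.encode(encoding, "replace")
    with open(path, "wb") as f:
        f.write(data)
    return encoding


def read_text(path, encoding):
    """Read bytes from path and decode them with the given encoding."""
    with open(path, "rb") as f:
        return f.read().decode(encoding)
```

The caller keeps the returned encoding name and passes it back to
read_text, so the same codec is used in both directions.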

> Why two different encodings 'cp1252' and 'iso-8859-1' are provided
> for my Windows 2000 system?

Python provides many more encodings, including UTF-8, KOI8-R,
ISO-8859-2, cp1250, and so on. Having many codecs available in
the library is a good thing, because different applications have
different needs.
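
To illustrate why the choice of codec matters, here is a small sketch
that encodes the same character with several of these codecs. The
helper name encode_with is hypothetical, and the example assumes a
current Python where str is the Unicode type.

```python
def encode_with(text, codec_names):
    """Map each codec name to the encoded bytes, or None if the codec
    cannot represent the text."""
    result = {}
    for name in codec_names:
        try:
            result[name] = text.encode(name)
        except UnicodeEncodeError:
            result[name] = None
    return result


sample = "\u00f6"  # LATIN SMALL LETTER O WITH DIAERESIS
table = encode_with(sample, ["utf-8", "iso-8859-1", "cp1252", "koi8-r"])
for name in sorted(table):
    print(name, repr(table[name]))
```

UTF-8 needs two bytes for this character, iso-8859-1 and cp1252 each
use a single byte, and koi8-r cannot represent it at all.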

I somehow feel this doesn't answer your question, but then, I don't
fully understand the question.

Regards,
Martin


