Python strings outside the 128 range

Piet van Oostrum piet at cs.uu.nl
Mon Jul 17 07:25:56 EDT 2006


>>>>> Michael Piotrowski <mxp at dynalabs.de> (MP) wrote:

>MP> On 2006-07-14 "Diez B. Roggisch" <deets at nospam.web.de> wrote:
>>> Sybren Stuvel schrieb:
>>>> Diez B. Roggisch enlightened us with:
>>>>> Of course not. AFAIK there is no way of figuring out which encoding
>>>>> the target console supports. The best you can do is to offer an
>>>>> option that allows selection of the output encoding.
>>>> 
>>>> You can use the LANG environment variable on many systems. On mine,
>>>> it's set to en_GB.UTF-8, which causes a lot of software to
>>>> automatically choose the right encoding.
>>> 
>>> That might be a good heuristic - but on my Mac no LANG is set. So I
>>> should rephrase my statement as "There is no reliable and
>>> cross-platform way of figuring out which encoding the console uses".

>MP> If LANG is not set, it's equivalent to setting it to "C".  However,
>MP> you shouldn't look directly at these variables (LANG and LC_*) but
>MP> rather use the functions from the locale module, e.g.:

>MP>   import locale
>MP>   locale.setlocale(locale.LC_ALL, '') # use the current locale settings
>MP>   encoding = locale.nl_langinfo(locale.CODESET)

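But if LANG isn't set (as on Mac OS X), this doesn't give you the proper
encoding: the locale falls back to "C", so the codeset reported is the one
for "C" (typically plain ASCII), not the one the terminal actually uses.
As a workaround, I have added LANG to .profile on my system.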
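
A minimal sketch of a best-effort fallback chain, for illustration only.
The function name guess_console_encoding and its parameters are
hypothetical, not part of any library. It assumes the output stream may
expose an `encoding` attribute and otherwise falls back to the locale,
which can still be wrong when LANG is unset:

```python
import codecs
import locale
import sys


def guess_console_encoding(stream=None, default="ascii"):
    """Best-effort guess at the console encoding (hypothetical helper).

    The result is a heuristic; it is not guaranteed to match what the
    terminal really displays.
    """
    if stream is None:
        stream = sys.stdout

    # 1. Ask the stream itself; the attribute may be missing or None,
    #    for example when output is redirected.
    encoding = getattr(stream, "encoding", None)

    # 2. Otherwise fall back to the locale settings (LANG and LC_*).
    if not encoding:
        try:
            locale.setlocale(locale.LC_ALL, "")  # use the current locale settings
        except locale.Error:
            pass  # unsupported locale: keep whatever is already active
        encoding = locale.getpreferredencoding(False)

    # 3. Only return a name Python has a codec for; else use the default.
    try:
        codecs.lookup(encoding)
    except (LookupError, TypeError):
        return default
    return encoding
```

Since no step is reliable on every platform, an explicit user option for
the output encoding should still take precedence over this guess.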
-- 
Piet van Oostrum <piet at cs.uu.nl>
URL: http://www.cs.uu.nl/~piet [PGP 8DAE142BE17999C4]
Private email: piet at vanoostrum.org


