LANG, locale, unicode, setup.py and Debian packaging

"Martin v. Löwis" martin at v.loewis.de
Sat Jan 12 18:08:42 EST 2008


> 2. If this returns "C" or anything without 'utf8' in it, then things start
> to go downhill:
>  2a. The app assumes unicode objects internally. i.e. Whenever there is
> a "string  like this" in a var it's supposed to be unicode. Whenever
> something comes into the app (from a filename, a file's contents, the
> command-line) it's assumed to be a byte-string that I decode("utf8") on
> before placing it into my objects etc.

That's a bug in the app. It shouldn't assume that environment variables
are UTF-8. Instead, it should assume that they are in the locale's
encoding, and determine that encoding with locale.getpreferredencoding().
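A minimal sketch of that approach, written for a current Python 3
interpreter rather than the Python of this thread. The helper name
decode_from_environment and its parameters are illustrative only; the
one real API it relies on is locale.getpreferredencoding().

```python
import locale


def decode_from_environment(raw, encoding=None, errors="strict"):
    """Decode bytes received from the environment into text.

    If no encoding is given, use the one the current locale prefers
    instead of hard-coding "utf8".
    """
    if encoding is None:
        # Ask the locale which encoding the user's environment uses.
        encoding = locale.getpreferredencoding()
    return raw.decode(encoding, errors)
```

In the C locale this will typically report an ASCII-compatible encoding,
so non-ASCII bytes raise UnicodeDecodeError unless a more forgiving
error handler is passed in.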

>  2b. Because of 2a and if the locale is not 'utf8 aware' (i.e. "C") I start
> getting all the old 'ascii' unicode decode errors. This happens at every
> string operation, at every print command and is almost impossible to fix.

If you print non-ASCII strings to the terminal, and you can't be certain
that the terminal supports the characters in the string, and you can't
reasonably deal with the exceptions, you should accept mojibake by
specifying the "replace" error handler when encoding strings to the
terminal's encoding.
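As a rough illustration in current Python 3 syntax (the function name
safe_for_terminal is made up for this sketch, and falling back to ASCII
when the stream reports no encoding is an assumption, not something
stated above):

```python
import sys


def safe_for_terminal(text, encoding=None):
    """Return text with every character the terminal cannot show replaced.

    Unencodable characters become "?" instead of raising
    UnicodeEncodeError.
    """
    if encoding is None:
        # sys.stdout.encoding can be None when output is redirected;
        # falling back to ASCII is the conservative choice (assumption).
        encoding = getattr(sys.stdout, "encoding", None) or "ascii"
    # Round-trip through the target encoding with the "replace" handler.
    return text.encode(encoding, errors="replace").decode(encoding)
```

With an ASCII terminal, safe_for_terminal("Löwis", "ascii") gives
"L?wis": lossy, but the print no longer fails.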

> 3. I made the decision to check the locale and stop the app if the return
> from getlocale is (None,None). 

I would avoid locale.getlocale. It's a pointless function (IMO).

Also, what's the purpose of this test?

> Does anyone have some ideas? Is there a universal "proper" locale that we
> could set a system to *before* the Debian build stuff starts? What would
> that be - en_US.utf8?

Your program definitely, absolutely must work in the C locale. Of
course, you cannot have any non-ASCII characters in that locale, so
deal with it.

If you have solved that, chances are high that it will work in other
locales as well (but be sure to try Turkish, as that gives a
surprising meaning to "I".lower()).
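On the 8-bit strings discussed in this thread, lower() is
locale-dependent, so under a Turkish locale "I" may not map to "i".
One way to sidestep that for identifiers and keywords is to lowercase
only the ASCII letters explicitly. The sketch below uses current
Python 3 syntax, and the helper name ascii_lower is hypothetical:

```python
import string

# Translation table that touches only the 26 ASCII capital letters.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def ascii_lower(text):
    """Lowercase A-Z only, leaving every other character unchanged."""
    return text.translate(_ASCII_LOWER)
```

This is only suitable for machine-readable tokens such as option names
or file extensions; text shown to users should still follow the
language's own casing rules.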

Regards,
Martin


