[I18n-sig] Perhaps the locale should matter?

Fredrik Lundh <effbot@telia.com>
Sun, 7 May 2000 17:16:13 +0200


Guido van Rossum wrote:
> I wonder if we could make the default conversion from 8-bit to Unicode
> depend on the locale?  This would be a compromise between my ASCII
> proposal and the Latin-1 proposal.  My reasoning is that the locale is
> an existing Python feature.  Code that is broken when the locale
> differs from the default has been broken for a long time.  We might
> not *like* a global setting for this kind of feature, but: "We've
> already got one!"  [Imitates thick French accent.]

well, I was going to suggest that we take that one away
in 1.7...

    "Avoidance of locales is strongly encouraged."
    (from the Perl unicode docs)

> If the program explicitly set the locale, it is a clear signal that it
> is interested in manipulating characters in a particular locale, and
> we might as well honor this.

no time to elaborate, but here's what my (yet unpublished)
"how to handle strings in 1.7" proposal says:

-- "narrow" strings should assume unicode, and use unicode-aware
   replacements for the ctype operations (isspace, isdigit,
   etc).

-- the locale should not control conversions between "narrow"
   and wide character strings.

-- the locale should be used to install codecs on standard I/O
   streams and on the system APIs (e.g. filenames), on Unix
   platforms (and compatibles).
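
a rough sketch of how the first and third points might look, written
against today's Python 3 (where str is unicode text) rather than the
1.7 design itself. install_codec is a hypothetical helper name, the
charset is hard-coded instead of being derived from the locale, and
a BytesIO object stands in for a standard output stream:

```python
import codecs
import io

def install_codec(raw_stream, charset):
    # hypothetical helper: wrap a byte stream so that text written
    # to it is encoded with the codec chosen for this stream
    return codecs.getwriter(charset)(raw_stream)

# character tests look at unicode code points, not at the C locale
nbsp_is_space = "\u00a0".isspace()      # NO-BREAK SPACE
arabic_is_digit = "\u0663".isdigit()    # ARABIC-INDIC DIGIT THREE

# assumption: the charset would normally come from the locale
raw = io.BytesIO()                      # stands in for stdout
out = install_codec(raw, "iso8859-1")
out.write("sm\u00f6rg\u00e5sbord")
encoded = raw.getvalue()                # the bytes that reached the stream
```

the point of the design is that the codec sits on the stream: the
text itself stays unicode, and only the bytes crossing the I/O
boundary depend on the chosen character set.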

that is, the Unix locale is reduced to being a platform specific
way to tell Python what language/locale we're running under
(the "dot charset" notation plus some simple heuristics is used
to determine a default character set).
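
the "dot charset" part could be sketched roughly as below. this is
only an illustration: charset_from_locale is a hypothetical name, the
"ascii" fallback is an assumption, and the "simple heuristics" are
reduced to falling back on that default when no usable charset is
given:

```python
import codecs

def charset_from_locale(name, default="ascii"):
    """Guess a character set from a locale name such as 'sv_SE.ISO8859-1'."""
    # drop an optional "@modifier" suffix, e.g. "de_DE.ISO8859-15@euro"
    name = name.split("@", 1)[0]
    if "." in name:
        charset = name.split(".", 1)[1]
        try:
            # normalise to the codec name Python uses for this charset
            return codecs.lookup(charset).name
        except LookupError:
            pass  # unknown charset: fall through to the default
    # no charset part (e.g. "C" or "sv_SE"): use the assumed default
    return default
```

for example, charset_from_locale("en_US.UTF-8") gives "utf-8", while
a bare "C" gives the default.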

for other platforms, use the platform specific mechanisms for
this (active code page, character set used by system font,
etc).

more later.

</F>