[I18n-sig] Perhaps the locale should matter?
Fredrik Lundh
"Fredrik Lundh" <effbot@telia.com>
Sun, 7 May 2000 17:16:13 +0200
Guido van Rossum wrote:
> I wonder if we could make the default conversion from 8-bit to Unicode
> depend on the locale? This would be a compromise between my ASCII
> proposal and the Latin-1 proposal. My reasoning is that the locale is
> an existing Python feature. Code that is broken when the locale
> differs from the default has been broken for a long time. We might
> not *like* a global setting for this kind of feature, but: "We've
> already got one!" [Imitates thick French accent.]
well, I was going to suggest that we take that one away
in 1.7...
"Avoidance of locales is strongly encouraged."
(from the Perl unicode docs)
> If the program explicitly set the locale, it is a clear signal that it
> is interested in manipulating characters in a particular locale, and
> we might as well honor this.
no time to elaborate, but here's what my (yet unpublished)
"how to handle strings in 1.7" proposal says:
-- "narrow" strings should assume unicode, and use unicode-aware
replacements for the ctype operations (isspace, isdigit, etc).
-- the locale should not control conversions between "narrow"
and wide character strings.
-- the locale should be used to install codecs on standard I/O
streams and on the system APIs (e.g. filenames), on Unix
platforms (and compatibles).
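
a rough sketch of the three points above, written in present-day Python 3 rather than anything that existed for 1.7. the helper names (is_digit, is_space, open_text) are made up for illustration; unicodedata.category, str.isspace and codecs.getreader are real standard-library calls. the only point is that character classification never consults the locale, while the byte-to-text codec is chosen once, at the I/O boundary:

```python
import codecs
import io
import unicodedata


def is_digit(ch):
    """Unicode-aware stand-in for C isdigit(); the locale is never consulted."""
    return unicodedata.category(ch) == "Nd"


def is_space(ch):
    """Unicode-aware stand-in for C isspace()."""
    return ch.isspace()


def open_text(raw, encoding):
    """Wrap a byte stream so that reads return decoded text.

    `encoding` is assumed to come from the environment (for example the
    locale); it is applied here, at the I/O boundary, and nowhere else.
    """
    return codecs.getreader(encoding)(raw)


# Bytes as they might arrive on a Latin-1 terminal or file.
raw = io.BytesIO("blåbär 42\n".encode("iso-8859-1"))
text = open_text(raw, "iso-8859-1").read()
digits = "".join(ch for ch in text if is_digit(ch))
```

once the stream is wrapped, the rest of the program only sees text, so changing the locale later cannot change what is_digit or is_space return.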
that is, the Unix locale is reduced to being a platform specific
way to tell Python what language/locale we're running under
(the "dot charset" notation plus some simple heuristics is used
to determine a default character set).
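
one way the "dot charset" lookup could work, again as a hedged sketch in modern Python 3. the function name charset_from_locale and the GUESSES table are assumptions invented for this example (the proposal does not say which heuristics it has in mind); codecs.lookup is the real standard-library call used to check that a charset name is known:

```python
import codecs

# Hypothetical fallback table for locale names without a ".charset" part.
# The entries are illustrative guesses, not taken from the proposal.
GUESSES = {
    "sv": "iso8859-1",
    "de": "iso8859-1",
    "ru": "koi8-r",
    "ja": "euc-jp",
}


def charset_from_locale(name, default="ascii"):
    """Derive a default character set from a Unix locale name.

    Handles names such as "sv_SE.ISO8859-1" or "de_DE.ISO8859-15@euro".
    Returns the codec's canonical name, or `default` if nothing usable
    can be determined.
    """
    base = name.split("@", 1)[0]          # drop a modifier like "@euro"
    if "." in base:
        charset = base.split(".", 1)[1]   # the "dot charset" part
    else:
        language = base.split("_", 1)[0].lower()
        charset = GUESSES.get(language, default)
    try:
        return codecs.lookup(charset).name
    except LookupError:
        return default
```

for example, charset_from_locale("en_US.UTF-8") gives "utf-8", and a bare "C" locale falls through to the ASCII default.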
for other platforms, use the platform specific mechanisms for
this (active code page, character set used by system font,
etc).
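
in today's Python 3 the platform-specific lookup is already wrapped by locale.getpreferredencoding, which did not exist in this form at the time of writing. a minimal sketch, assuming that call is an acceptable stand-in for "active code page" and similar mechanisms; platform_default_charset is a hypothetical helper name:

```python
import codecs
import locale


def platform_default_charset(default="ascii"):
    """Ask the platform for its preferred encoding, normalised to a codec name.

    Passing False avoids changing the process-wide locale as a side effect.
    """
    name = locale.getpreferredencoding(False)
    try:
        return codecs.lookup(name).name
    except LookupError:
        # The platform reported something Python has no codec for.
        return default


charset = platform_default_charset()
```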
more later.
</F>