translating foreign data

Richard Damon Richard at Damon-Family.org
Sat Jun 23 08:34:00 EDT 2018


On 6/23/18 8:03 AM, Marko Rauhamaa wrote:
> Richard Damon <Richard at Damon-Family.org>:
>> If you know the Locale, then you do know what the decimal separator
>> is, as that is part of what a locale defines.
> I don't know what that sentence means.
When you set the locale
>
>> The issue is that if you just know the encoding, you don't necessarily
>> know the locale.
> I always know my locale. The locale is tied to the human user.
No, it should be tied to the data you are processing. If an English user
is feeding a program Chinese documents, while processing those documents
the program should be using the appropriate Chinese Locale. When
generating output to the user, it should switch (back) to the
appropriate English Locale (likely the system locale that the user set).
>
>> He also commented that he didn't want to set the locale in the
>> routine, as that sets it globally for the full application (but
>> perhaps the latter could be fixed by first doing a
>> locale.getlocale(), then setlocale for the file's locale, and then at
>> the end of reading and processing restoring the old locale).
> Setting a locale application-wise is
>
>  * not in accordance with the idea of a locale (the locale should be
>    constant within a user session)
Again, no: a locale is tied to the data, not the user (unless you want
to require the user to translate all data to his own locale conventions,
without the help of a program that can use locale information, before
providing it to a program). Yes, the default interpretation should be
the user's default/current locale, but you really want the user to be
able to say "I got this file from someone whose locale was different
from mine."

Data presented to the user should normally use his locale (unless he has
specified something different).
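As a minimal sketch of the save-and-restore idea quoted above (the names
temporary_locale and parse_number are made up for illustration, and the
locale names accepted by setlocale depend on what the platform has
installed), something like this would confine the change to the part of
the code that reads the foreign data:

```python
import locale
from contextlib import contextmanager


@contextmanager
def temporary_locale(name, category=locale.LC_NUMERIC):
    """Switch the process-wide locale for a block, then restore it.

    Hypothetical helper. The locale is still global while the block
    runs, so this does not make the switch safe across threads.
    """
    # Calling setlocale() without a locale argument only queries the
    # current setting; the returned string can be passed back later.
    saved = locale.setlocale(category)
    try:
        locale.setlocale(category, name)
        yield
    finally:
        locale.setlocale(category, saved)


def parse_number(text, data_locale):
    """Parse text using the numeric conventions of the data's locale."""
    with temporary_locale(data_locale):
        return locale.atof(text)
```

A call such as parse_number("1.234,5", "de_DE.UTF-8") would then use
German conventions for that one conversion, assuming a locale of that
name exists on the system, and the previous locale is back in effect for
anything shown to the user afterwards.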
>
>  * not easily possible (the locale is seen by all threads
>    simultaneously)
That is an implementation error. It should be possible to create a
thread-specific locale, and it is really useful to be able to create a
local locale object that can be passed to the various conversion
operations to say "for this conversion, use this specific locale,"
because that is how the data says it is to be interpreted.
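To illustrate what I mean by a local locale handed to a conversion, here
is a rough sketch. NumberFormat and parse_float are hypothetical names,
not part of the standard library, and only the decimal point and
thousands separator are modelled. Because nothing global is touched,
different threads can use different conventions at the same time:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NumberFormat:
    """Hypothetical stand-in for the numeric part of a locale."""
    decimal_point: str = "."
    thousands_sep: str = ""


def parse_float(text, fmt):
    """Convert text to float using the conventions passed in fmt."""
    if fmt.thousands_sep:
        # Drop grouping characters before looking at the decimal point.
        text = text.replace(fmt.thousands_sep, "")
    if fmt.decimal_point != ".":
        text = text.replace(fmt.decimal_point, ".")
    return float(text)


# Example conventions, written out by hand rather than read from the OS.
GERMAN_STYLE = NumberFormat(decimal_point=",", thousands_sep=".")
US_STYLE = NumberFormat(decimal_point=".", thousands_sep=",")

print(parse_float("1.234,5", GERMAN_STYLE))
print(parse_float("1,234.5", US_STYLE))
```

Both calls yield the same number; the only thing that differs is the
convention the caller says the data was written in.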
>
>
> BTW, I think the locale is a terrible invention.
>
>
> Marko

The locale is a lot better than the alternative, where every application
that needs to deal with internationalization needs to recreate (and
debug) all of the mechanism. I agree it isn't perfect, and for small
simple programs it would be nice to be able to say "I don't want all
this stuff, make it go away".

Python took its locale (at least initially) from C, where the locale is
a single global, which causes more issues. C++ objectified the
locale and allows the programmer to imbue a specific locale into
different parts of his program (in particular, each I/O stream knows
which locale its data is to be processed with). Perhaps it would be good
for Python to adopt the object-based locale concept of C++ (maybe it
already has), where streams know their locale and other operations can
optionally be passed a locale to use, though that does come at a
significant cost for things like CPython.
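Just to show the shape of the idea in Python terms, here is a sketch of
a stream that carries the conventions of its own data. LocaleAwareReader
is a made-up class, not an existing Python or C++ API, and it only
handles one number per line:

```python
import io


class LocaleAwareReader:
    """Hypothetical wrapper: a text stream plus its numeric conventions."""

    def __init__(self, stream, decimal_point=".", thousands_sep=""):
        self.stream = stream
        self.decimal_point = decimal_point
        self.thousands_sep = thousands_sep

    def read_floats(self):
        """Yield one float per non-empty line, using this stream's rules."""
        for line in self.stream:
            line = line.strip()
            if not line:
                continue
            if self.thousands_sep:
                line = line.replace(self.thousands_sep, "")
            yield float(line.replace(self.decimal_point, "."))


# Two streams with different conventions can be read side by side,
# because each one knows how its own data is written.
german = LocaleAwareReader(io.StringIO("1.234,5\n0,25\n"), ",", ".")
english = LocaleAwareReader(io.StringIO("1,234.5\n"), ".", ",")
print(list(german.read_floats()), list(english.read_floats()))
```

The caller states the conventions once, when the stream is opened, which
is roughly the role imbuing a locale into a stream plays in C++.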

-- 
Richard Damon



