translating foreign data

Richard Damon richard.damon at 1
Sun Jun 24 16:47:54 EDT 2018


From: Richard Damon <Richard at Damon-Family.org>

On 6/23/18 10:44 PM, Steven D'Aprano wrote:
> On Sat, 23 Jun 2018 17:52:55 -0400, Richard Damon wrote:
>
>> If you have more than just a number representing a value in the locale
>> currency, you can't ask the locale how to present/accept it.
> You're the only one saying that it has to be handled by the locale.
>
>
Actually, it was part of the problem statement by Marko, since he said to use
LC_MONETARY, which is the part of the Locale machinery dealing with monetary
quantities (and can ONLY handle the currency defined by the Locale). What would
you think of providing a program in, say, Java, to a problem statement that
said to write a Python program?

I suppose he could have just meant use the number, which would be like asking
to interpret the value of 100 euros using math.pi.

Or it could have been just a bad question like how heavy is blue. (Since by
definition a locale only knows how to handle a single type of currency,
assuming any value is of that type).
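
To make the single-currency point concrete, here is a minimal sketch using
Python's standard locale module. It assumes whatever locale the environment
provides and falls back to "C" if that one is not installed. Note that
locale.currency() takes only a number; there is no argument for saying which
currency the number is in.

```python
import locale

# Adopt the environment's locale; fall back to the portable "C" locale
# if the environment names one that is not installed on this machine.
try:
    locale.setlocale(locale.LC_ALL, "")
except locale.Error:
    locale.setlocale(locale.LC_ALL, "C")

conv = locale.localeconv()

# The monetary part of a locale describes exactly one currency: one local
# symbol and one international code. There is no slot for a second one.
print("local symbol:        ", repr(conv["currency_symbol"]))
print("international symbol:", repr(conv["int_curr_symbol"]))
```

Under the "C" locale both strings are empty, which is that locale's way of
saying it has no currency at all.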

My answer was in part to point out the problem with the problem statement (and
people seem to want to jump on me for pointing out the strengths and weaknesses
of the locale system).

This also goes back to the very original question at the beginning of the
thread: the OP had a bunch of data with numbers using varying locale
conventions (he didn't use those words, but the data had various decimal
separators), and some people asked about non-'arabic' numbers (0-9).

This also goes back to some of the comments about file formats. Most file
formats are designed to be 'Machine Read' (even if they use text formatting)
and as such do NOT use localization facilities, so when processing them you
want the I/O processing system to be in a non-localized mode (typically numbers
always use . as the decimal separator, and usually nothing as the thousands
separator). While the text format files might be opened in a text editor, the
file format doesn't cater to making things pretty for the user. Some programs
will create input/output/storage files where it is expected that the user WILL
open them, look at them and maybe even edit them. In those files, numbers will
use the locale conventions for currency and decimal/thousands separators. If you
have such a system, changing the locale rules for these files may cause the
values to be misinterpreted.
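
A rough Python sketch of the two modes, under some assumptions: the helper
names parse_machine and parse_localized are made up for illustration, and
"de_DE.UTF-8" is just an example of a locale with a decimal comma that may or
may not be installed. float() never consults the locale, while locale.atof()
follows the active LC_NUMERIC settings.

```python
import locale

def parse_machine(text):
    # float() always expects "." as the decimal separator, whatever the
    # current locale is, which suits machine-readable file formats.
    return float(text)

def parse_localized(text, locale_name):
    # locale.atof() honours LC_NUMERIC, so switch it temporarily and
    # restore the previous setting afterwards.
    previous = locale.setlocale(locale.LC_NUMERIC)  # query only
    try:
        locale.setlocale(locale.LC_NUMERIC, locale_name)
        return locale.atof(text)
    finally:
        locale.setlocale(locale.LC_NUMERIC, previous)

print(parse_machine("1234.56"))
print(parse_localized("1234.56", "C"))

# The same characters read under a decimal-comma locale need not give
# the same value; this is the misinterpretation described above.
try:
    print(parse_localized("1234.56", "de_DE.UTF-8"))
except locale.Error:
    print("de_DE.UTF-8 is not installed here")
```

Keep in mind that setlocale() changes state for the whole process, so this
kind of switching is best avoided in threaded programs.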

If you are bringing such files from a 'foreign' system, you need to be able to
indicate what locale to use when reading that file. This sounds very much like
the category of problem that the OP was dealing with. They apparently have a
large number of files, presumably organized in some consistent manner so that
the values in them make sense, but the numbers are written in different locale
conventions, and this was causing the simplistic processing to fail.
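
One way to "indicate what locale to use" without touching the process-wide
locale is to carry the separators along with each file. The sketch below is
only an illustration: the CONVENTIONS table, its keys, and parse_number are
hypothetical names, and a real data set would need its own table plus
handling for signs, currency symbols and malformed input.

```python
from decimal import Decimal

# Hypothetical table of the separators each data source uses.
CONVENTIONS = {
    "us": {"decimal": ".", "thousands": ","},
    "de": {"decimal": ",", "thousands": "."},
    "fr": {"decimal": ",", "thousands": " "},
}

def parse_number(text, source):
    """Parse text using the separators declared for the given source."""
    conv = CONVENTIONS[source]
    # Drop the grouping character first, then normalise the decimal
    # separator to "." so that Decimal can read the result.
    cleaned = text.strip().replace(conv["thousands"], "")
    cleaned = cleaned.replace(conv["decimal"], ".")
    return Decimal(cleaned)

print(parse_number("1,234.56", "us"))
print(parse_number("1.234,56", "de"))
print(parse_number("1 234,56", "fr"))
```

All three calls describe the same quantity; what differs is only the
convention that has to be declared for each source.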

--
Richard Damon



