translating foreign data

Richard Damon Richard at Damon-Family.org
Fri Jun 22 06:48:42 EDT 2018


On 6/22/18 4:43 AM, Ethan Furman wrote:
> On 06/21/2018 01:20 PM, Ben Bacarisse wrote:
>
>> The code page remark is curious.  Will some "code pages" have digits
>> that are not ASCII digits?
>
> Good question.  I have no idea.  I get the appropriate decoder/encoder
> based on the code page contained in the file, then decode to unicode
> and go from there.  Unfortunately, that doesn't convert the decimal
> comma to the decimal point. :(  So I was hoping to map the code page
> to a locale that would properly translate the numbers for me, but so
> far what I have found in my readings suggests that in order to use the
> locale option I would have to actually change the active locale and
> potentially mess up every other part of the program when the file in
> question is opened in a locale that's different from its code page.
>
> Worst case scenario is I manually create a map for each code page to
> decimal separator, but there's more than a few and I'd rather not if
> there is already a prebuilt solution out there.
>
> -- 
> ~Ethan~
>
One problem is that a code page does NOT uniquely determine which
decimal separator (or which locale) to use. The decimal separator
problem shows up even in files that are pure ASCII, and Latin-1 files
are full of it too, since that one encoding serves both locales that
write a decimal point and locales that write a decimal comma.
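
To make that concrete, here is a minimal sketch (not a prebuilt
solution, and not anything taken from Ethan's library). It assumes the
decimal separator is supplied separately by the caller or by file
metadata, and the helper name parse_number is hypothetical. It first
shows that the same bytes decode to the same text under several code
pages, so the code page by itself gives no hint, and then converts the
field without touching the process-wide locale.

```python
from decimal import Decimal

# The bytes for "3,14" are plain ASCII, so every ASCII-compatible
# code page decodes them to the same text. The code page therefore
# says nothing about whether "," is a decimal separator.
raw = b"3,14"
codecs_to_try = ("ascii", "latin-1", "cp437", "cp850", "cp1252")
decoded = {raw.decode(name) for name in codecs_to_try}


def parse_number(text, decimal_sep=","):
    """Parse a numeric field using an explicitly supplied separator.

    Hypothetical helper: the separator must come from the caller,
    not from the code page. Thousands separators are not handled.
    The global locale is never changed.
    """
    cleaned = text.strip()
    if decimal_sep != ".":
        if "." in cleaned:
            # A "." here would be silently read as a decimal point
            # after the replacement below, so refuse instead.
            raise ValueError(f"unexpected '.' in {text!r}")
        cleaned = cleaned.replace(decimal_sep, ".")
    return Decimal(cleaned)


value = parse_number(raw.decode("cp1252"), decimal_sep=",")
```

Passing the separator explicitly keeps the conversion local to the one
file being read, so the rest of the program keeps whatever locale it
already has.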

-- 
Richard Damon



