[Python-Dev] Re: Be Honest about LC_NUMERIC [REPOST]
James Henstridge
james at daa.com.au
Tue Aug 19 23:04:49 EDT 2003
On 14/08/2003 10:21 AM, Christian Reis wrote:
>So, in an attempt to garner comments (now that we have 2.3 off the
>chopping block) I'm reposting my PEP proposal (with minor updates).
>Comments would be appreciated, of course (nudges Barry slightly after
>him getting me to write this on my only free Sunday in months ;)
>
In my opinion, this is worth investigating.
One of the things I love about Python is the way it can be used to
glue different bits of code together to form useful programs. Python's
handling of LC_NUMERIC seems to work against this goal.
According to the POSIX standard, standard library functions like
strtod() and printf()/sprintf() work with locale representations of
floating point numbers once the setlocale() function has been called.
For locale aware libraries, it seems quite sensible for them to use
standard library functions to format numbers for display, and to read
numbers entered by the user (which may use a comma for the decimal point
under some locales).
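As a rough illustration of that behaviour, the sketch below parses a string with strtod() and reports where parsing stopped. The helper names are made up for this example. Under the C locale the comma in "3,14" ends the number; under a locale whose decimal point is a comma (for instance de_DE.UTF-8, assuming it is installed) the same call would be expected to consume it.

```c
#include <locale.h>
#include <stdlib.h>

/* Hypothetical helper: parse `text` with strtod() under whatever
 * LC_NUMERIC is currently active, and store the first character
 * that strtod() did not consume in *stopped_at. */
double parse_leading_double(const char *text, char *stopped_at)
{
    char *end = NULL;
    double value = strtod(text, &end);
    *stopped_at = *end;
    return value;
}

/* Hypothetical helper: the decimal point string strtod() and printf()
 * use in the current locale ("." in the C locale). */
const char *current_decimal_point(void)
{
    return localeconv()->decimal_point;
}
```

A caller would use setlocale(LC_NUMERIC, "") to pick up the user's locale before calling these, which is exactly the step that Python currently asks programs not to take.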
Now Python uses the C standard library functions to parse floating
point numbers too, but doesn't want these string <-> float conversions
to be locale sensitive.
The solution currently in place is to declare that LC_NUMERIC must be C
for Python programs (or applications embedding Python) or things will go
wrong. If a bit of Python code wishes to format a float according to
the locale, it can call locale.format(), and to read a float from the
user it can use locale.atof(). These use locale data read while the
locale was set to the correct value temporarily.
Unfortunately, this does not help external libraries that know nothing
of Python's requirements about how locales are set up, so they often
end up representing and reading floating point numbers under the C locale
(using a full stop as the decimal point). This is the problem Christian
ran into with the GtkSpinButton widget in GTK, and I would not be
surprised if other people have run into the problem as well.
There are two solutions to this problem that I can see:
  1. modify Python so that it doesn't use locale sensitive conversion
     functions when it wants to convert floats in a locale independent
     manner.
  2. modify every external library that makes use of the standard
     library strtod()/sprintf(), expecting locale sensitive float
     conversions, to use some other API.
Clearly (1) is the easier option, as there is a finite (and quite small)
amount of code to change. For (2), there is potentially an unlimited
amount of code to change.
As Christian said, there is code in glib (not to be confused with glibc,
the GNU C library) that could act as a basis for locale independent
float conversion functions in Python. The code was written by Alex
Larsson (who works at Red Hat, so I suppose they own the copyright), who
is willing to license it under Python's terms.
You can see the history of the two functions (g_ascii_strtod() and
g_ascii_formatd()) here:
http://cvs.gnome.org/bonsai/cvsblame.cgi?file=glib/glib/gstrfuncs.c#328
There are very minor alterations by other people (they look minor enough
that the FSF wouldn't require a copyright assignment), but you could
always use the versions from the initial checkin (rev 1.77) if that is a
problem.
One alternative that some programs use is to do locale independent
conversions with code a bit like this:
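    /* query the current setting; setlocale(LC_NUMERIC, "C") would
       return the new locale, not the old one */
    char *oldlocale = setlocale(LC_NUMERIC, NULL);
    setlocale(LC_NUMERIC, "C");
    num = strtod(string, NULL);
    setlocale(LC_NUMERIC, oldlocale);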
This particular code snippet has some problems that I think make it
unsuitable for Python:
  * setlocale() affects the whole process, so this sort of operation
    could affect the results of strtod(), printf(), etc. for some other
    thread in the program.
  * setlocale() is not reentrant. In the above snippet, oldlocale is a
    pointer to a static string. If you surround the snippet with
    another pair of setlocale() calls, you can get unexpected
    results. This means that adding setlocale() calls to random
    Python API calls has the potential to break existing code.
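To make the second point concrete: the string returned by setlocale() may be overwritten by a later setlocale() call, so a more careful version of the pattern has to copy it first. The sketch below does that; strtod_in_c_locale() is a hypothetical name used only for illustration. Note that it only addresses the static string issue. It still changes the locale for the whole process, so the first problem remains.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse `text` with LC_NUMERIC temporarily set
 * to "C", then restore the previous setting from a private copy. */
double strtod_in_c_locale(const char *text)
{
    const char *current = setlocale(LC_NUMERIC, NULL);  /* query only */
    char *saved = NULL;
    double value;

    /* Copy the name: the buffer behind `current` may be reused by
     * the next setlocale() call. */
    if (current != NULL) {
        saved = malloc(strlen(current) + 1);
        if (saved != NULL)
            strcpy(saved, current);
    }

    setlocale(LC_NUMERIC, "C");
    value = strtod(text, NULL);

    /* If the copy could not be made, the locale is left as "C". */
    if (saved != NULL) {
        setlocale(LC_NUMERIC, saved);
        free(saved);
    }
    return value;
}
```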
Alex's code from glib does not suffer from these problems.
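For readers who want a feel for what a conversion without setlocale() can look like, here is a minimal sketch. It is not the glib code, and ascii_strtod_sketch() is a hypothetical name. It assumes one simple approach: scan the ASCII number by hand, copy it with '.' replaced by the current locale's decimal point, and hand the copy to strtod(). It only handles plain decimal notation (no hex floats, "inf" or "nan"), and localeconv() itself is not guaranteed to be safe if another thread calls setlocale() at the same time.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

static int is_ascii_digit(char c) { return c >= '0' && c <= '9'; }

/* Hypothetical helper: parse a float written with '.' as the decimal
 * point, whatever LC_NUMERIC is, without calling setlocale(). */
double ascii_strtod_sketch(const char *text, char **endptr)
{
    const char *point = localeconv()->decimal_point;
    size_t point_len = strlen(point);
    const char *p = text;
    const char *dot = NULL;
    int mantissa_digits = 0;
    size_t prefix_len;
    char *copy;
    double value;

    /* Scan [whitespace][sign]digits[.digits][e[sign]digits]. */
    while (*p == ' ' || (*p >= '\t' && *p <= '\r')) p++;
    if (*p == '+' || *p == '-') p++;
    while (is_ascii_digit(*p)) { p++; mantissa_digits++; }
    if (*p == '.') {
        dot = p++;
        while (is_ascii_digit(*p)) { p++; mantissa_digits++; }
    }
    if (mantissa_digits == 0) {          /* nothing that looks like a number */
        if (endptr) *endptr = (char *)text;
        return 0.0;
    }
    if (*p == 'e' || *p == 'E') {        /* exponent only if digits follow */
        const char *q = p + 1;
        if (*q == '+' || *q == '-') q++;
        if (is_ascii_digit(*q)) {
            while (is_ascii_digit(*q)) q++;
            p = q;
        }
    }

    /* Copy the scanned prefix, swapping '.' for the locale's decimal point. */
    prefix_len = (size_t)(p - text);
    copy = malloc(prefix_len + point_len + 1);
    if (copy == NULL) {
        if (endptr) *endptr = (char *)text;
        return 0.0;
    }
    if (dot != NULL) {
        size_t before = (size_t)(dot - text);
        memcpy(copy, text, before);
        memcpy(copy + before, point, point_len);
        memcpy(copy + before + point_len, dot + 1, prefix_len - before - 1);
        copy[prefix_len - 1 + point_len] = '\0';
    } else {
        memcpy(copy, text, prefix_len);
        copy[prefix_len] = '\0';
    }

    value = strtod(copy, NULL);
    free(copy);
    if (endptr) *endptr = (char *)p;
    return value;
}
```

Because the input is scanned by hand, a string like "1,5" stops at the comma even when the active locale would accept it, which is the locale independent behaviour Python wants for its own float parsing.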
To sum it up, the current status quo in Python w.r.t. locales is getting
in the way of some of the things people want to use Python for. It would
be nice to fix this problem.
James.
--
Email: james at daa.com.au
WWW: http://www.daa.com.au/~james/