[Python-Dev] Re: Be Honest about LC_NUMERIC [REPOST]

James Henstridge james at daa.com.au
Tue Aug 19 23:04:49 EDT 2003


On 14/08/2003 10:21 AM, Christian Reis wrote:

>So, in an attempt to garner comments (now that we have 2.3 off the
>chopping block) I'm reposting my PEP proposal (with minor updates).
>Comments would be appreciated, of course (nudges Barry slightly after
>him getting me to write this on my only free Sunday in months ;)
>

In my opinion, this is worth investigating.

One of the things I love about Python is the way it can be used to 
glue different bits of code together to form useful programs.  Python's 
handling of LC_NUMERIC seems to work against this goal.

According to the POSIX standard, standard library functions like 
strtod() and printf()/sprintf() work with locale representations of 
floating point numbers once the setlocale() function has been called.  
For locale aware libraries, it seems quite sensible to use these 
standard library functions to format numbers for display, and to read 
numbers entered by the user (which may use a comma for the decimal point 
under some locales).
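
To make the locale dependence concrete, here is a small sketch (the 
helper name chars_consumed_by_strtod is made up for illustration) that 
reports how much of a string strtod() accepts under whatever LC_NUMERIC 
setting is currently in effect:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: report how many characters of `text` strtod()
 * accepts as a number under the current LC_NUMERIC setting. */
size_t chars_consumed_by_strtod(const char *text)
{
    assert(text != NULL);

    char *end = NULL;
    (void)strtod(text, &end);   /* only the end position matters here */
    return (size_t)(end - text);
}
```

In the default C locale this returns 4 for "3.14" and 1 for "3,14", 
since parsing stops at the comma.  After a successful 
setlocale(LC_NUMERIC, "de_DE.UTF-8") (assuming such a locale is 
installed), I would expect the two results to swap.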

Python also uses the C standard library functions to parse floating 
point numbers, but doesn't want these string <-> float conversions 
to be locale sensitive.

The solution currently in place is to declare that LC_NUMERIC must be C 
for Python programs (or applications embedding Python) or things will go 
wrong.  If a bit of Python code wishes to format a float according to 
the locale, it can call locale.format(), and to read a float from the 
user it can use locale.atof().  These use locale data read while the 
locale was temporarily set to the correct value.

Unfortunately, this does not help external libraries that know nothing 
of Python's requirements about how locales are set up: with LC_NUMERIC 
forced to C, they will always represent and read floating point numbers 
under the C locale (using a full stop as the decimal point).  This is 
the problem Christian ran into with the GtkSpinButton widget in GTK, and 
I would not be surprised if other people have run into it as well.

There are two solutions to this problem that I can see:

   1. modify Python so that it doesn't use locale sensitive conversion
      functions when it wants to convert floats in a locale independent
      manner.
   2. modify every external library that makes use of the standard
      library strtod()/sprintf(), expecting locale sensitive float
      conversions, to use some other API.

Clearly (1) is the easier option, as there is a finite (and quite small) 
amount of code to change.  For (2), there is potentially an unlimited 
amount of code to change.

As Christian said, there is code in glib (not to be confused with glibc: 
the GNU C library) that could act as a basis for locale independent 
float conversion functions in Python.  The code was written by Alex 
Larsson (who works at Red Hat, so I suppose they own the copyright), who 
is willing to license it under Python's terms.

You can see the history of the two functions (g_ascii_strtod() and 
g_ascii_formatd()) here:
    http://cvs.gnome.org/bonsai/cvsblame.cgi?file=glib/glib/gstrfuncs.c#328

There are very minor alterations by other people (they look minor enough 
that the FSF wouldn't require a copyright assignment), but you could 
always use the versions from the initial checkin (rev 1.77) if that is a 
problem.

One alternative that some programs use to do locale independent 
conversions is code a bit like this:
    char *oldlocale = setlocale(LC_NUMERIC, NULL);  /* query current */
    setlocale(LC_NUMERIC, "C");
    num = strtod(string, NULL);
    setlocale(LC_NUMERIC, oldlocale);

This particular code snippet has some problems that I think make it 
unsuitable for Python:

    * setlocale() affects the whole process, so this sort of operation
      could affect the results of strtod, printf, etc for some other
      thread in the program.
    * setlocale() is not reentrant.  oldlocale in the above snippet is a
      pointer to a static string.  If you surround the snippet with
      another pair of setlocale() calls, you can get unexpected
      results.  This means that adding setlocale() calls to random
      Python API calls has the potential to break existing code.
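
For illustration, the second problem can be reduced by copying the 
locale name before changing it.  This is only a sketch (the name 
strtod_in_c_locale is hypothetical), and it still changes the locale 
for the whole process, so the first problem remains:

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse `text` with LC_NUMERIC temporarily set to
 * "C".  Still NOT thread safe: other threads see the changed locale. */
double strtod_in_c_locale(const char *text)
{
    assert(text != NULL);

    /* Query the current setting and take a private copy, because a
     * later setlocale() call may overwrite the returned string. */
    const char *current = setlocale(LC_NUMERIC, NULL);
    char *saved = NULL;
    if (current != NULL) {
        size_t n = strlen(current) + 1;
        saved = malloc(n);
        if (saved != NULL)
            memcpy(saved, current, n);
    }
    if (saved == NULL) {
        /* Could not save the old setting, so leave the locale alone.
         * The result is then parsed under the current locale. */
        return strtod(text, NULL);
    }

    setlocale(LC_NUMERIC, "C");
    double value = strtod(text, NULL);

    /* Restore from our own copy rather than from a static buffer. */
    setlocale(LC_NUMERIC, saved);
    free(saved);
    return value;
}
```

Because the saved name lives in memory owned by the caller, nesting 
this inside another save/restore pair no longer depends on a shared 
static buffer.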

Alex's code from glib does not suffer from these problems.
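
As a rough idea of how a conversion can avoid setlocale() entirely, 
here is a simplified sketch.  It is not the glib code: the function 
name and helpers are hypothetical, it handles only plain decimal 
notation (no "inf", "nan" or hex floats), and it assumes localeconv() 
may be called safely at this point.  It finds the extent of the number 
using ASCII rules, then hands strtod() a copy in which '.' is replaced 
by the current locale's decimal point:

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

static int is_ascii_digit(char c) { return c >= '0' && c <= '9'; }
static int is_ascii_space(char c) { return c == ' ' || (c >= '\t' && c <= '\r'); }

/* Hypothetical helper: parse a double written with '.' as the decimal
 * point, whatever LC_NUMERIC is set to.  If endptr is non-NULL it
 * receives a pointer just past the parsed text (or `text` on failure). */
double ascii_strtod_sketch(const char *text, char **endptr)
{
    assert(text != NULL);

    const char *point = localeconv()->decimal_point;
    size_t point_len = strlen(point);

    /* Step 1: find the extent of the number using ASCII rules only. */
    const char *start = text;
    while (is_ascii_space(*start)) start++;
    const char *p = start;
    if (*p == '+' || *p == '-') p++;

    size_t digits = 0;
    const char *dot = NULL;
    while (is_ascii_digit(*p)) { p++; digits++; }
    if (*p == '.') {
        dot = p++;
        while (is_ascii_digit(*p)) { p++; digits++; }
    }
    if (digits == 0) {                  /* nothing that looks like a number */
        if (endptr) *endptr = (char *)text;
        return 0.0;
    }
    if (*p == 'e' || *p == 'E') {       /* exponent needs at least one digit */
        const char *q = p + 1;
        if (*q == '+' || *q == '-') q++;
        if (is_ascii_digit(*q)) {
            while (is_ascii_digit(*q)) q++;
            p = q;
        }
    }
    const char *end = p;

    /* Step 2: copy the number, swapping '.' for the locale's decimal
     * point, so that strtod() does the actual conversion. */
    char *copy = malloc((size_t)(end - start) + point_len + 1);
    if (copy == NULL) {
        if (endptr) *endptr = (char *)text;
        return 0.0;
    }
    char *out = copy;
    for (const char *s = start; s < end; s++) {
        if (s == dot) {
            memcpy(out, point, point_len);
            out += point_len;
        } else {
            *out++ = *s;
        }
    }
    *out = '\0';

    double value = strtod(copy, NULL);
    free(copy);
    if (endptr) *endptr = (char *)end;
    return value;
}
```

The process locale is only read, never changed, so a call like 
ascii_strtod_sketch("3,14", &end) stops at the comma regardless of what 
LC_NUMERIC is set to.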


To sum up, the current status quo in Python w.r.t. locales is causing 
problems for some of the tasks people want to use Python for.  It would 
be nice to fix this.

James.

-- 
Email: james at daa.com.au
WWW:   http://www.daa.com.au/~james/




