[Numpy-discussion] formatting issues, locale and co

Sun Dec 28 01:58:58 EST 2008

On Sat, Dec 27, 2008 at 11:46 PM, Robert Kern <robert.kern at gmail.com> wrote:

> On Sun, Dec 28, 2008 at 01:38, Charles R Harris
> <charlesr.harris at gmail.com> wrote:
> >
> > On Sat, Dec 27, 2008 at 10:27 PM, David Cournapeau
> > <david at ar.media.kyoto-u.ac.jp> wrote:
> >>
> >> Hi,
> >>
> >>    While looking at the last failures of numpy trunk on windows for
> >> python 2.5 and 2.6, I got into floating point number formatting issues;
> >> I got deeper and deeper, and now I am lost. We have several problems:
> >>    - we are not consistent between platforms, nor are we consistent
> >> with python
> >>    - str(np.float32(a)) is locale dependent, but python str method is
> >> not (locale.str is)
> >>    - formatting of long double does not work on windows because of the
> >> broken long double support in mingw.
> >>
> >> 1 consistency problem:
> >> ----------------------
> >>
> >> python -c "a = 1e20; print a" -> 1e+020
> >> python26 -c "a = 1e20; print a" -> 1e+20
> >>
> >> In numpy, we use PyOS_snprintf for formatting, but python itself uses
> >> PyOS_ascii_formatd - which has different behavior on different versions
> >> of python. The above behavior can be simply reproduced in C:
> >>
> >> #include <Python.h>
> >>
> >> int main()
> >> {
> >>    double x = 1e20;
> >>    char c[200];
> >>
> >>    PyOS_ascii_formatd(c, sizeof(c), "%.12g", x);
> >>    printf("%s\n", c);
> >>    printf("%g\n", x);
> >>
> >>    return 0;
> >> }
> >>
> >> On 2.5, this will print:
> >>
> >> 1e+020
> >> 1e+020
> >>
> >> But on 2.6, this will print:
> >>
> >> 1e+20
> >> 1e+020
> >>
> >> 2 locale dependency:
> >> --------------------
> >>
> >> Another issue is that our own formatting is locale dependent, whereas
> >> python isn't:
> >>
> >> import numpy as np
> >> import locale
> >> locale.setlocale(locale.LC_NUMERIC, 'fr_FR')
> >> a = 1.2
> >>
> >> print "str(a)", str(a)
> >> print "locale.str(a)", locale.str(a)
> >> print "str(np.float32(a))", str(np.float32(a))
> >> print "locale.str(np.float32(a))", locale.str(np.float32(a))
> >>
> >> Returns:
> >>
> >> str(a) 1.2
> >> locale.str(a) 1,2
> >> str(np.float32(a)) 1,2
> >> locale.str(np.float32(a)) 1,20000004768
> >>
> >> I thought about copying the way python does the formatting in the trunk
> >> (where discrepancies between platforms have been fixed), but this is not
> >> so easy, because it uses a lot of code from different places - and the
> >> code needs to be adapted to float and long double. The other solution
> >> would be to do our own formatting, but this does not sound easy:
> >> formatting in C is hard. I am not sure about what we should do, if
> >> anyone else has any idea ?
> >
> > I think the first thing to do is make a decision on locale. If we choose
> > to support locales I don't see much choice but to depend on Python, because
> > it's too much work otherwise, and work not directly related to Numpy at
> > that. If we decide not to support locales then we can do our own formatting
> > if we need to, using a fixed choice of locale. There is a list of snprintf
> > implementations here. Trio looks like a mature project and has an MIT
> > license, which I think is a license compatible with Numpy.
>
> We should not support locales. The string representations of these
> elements should be Python-parseable.
>
> > I'm inclined to just fix the locale and ignore the rest until Python gets
> > things sorted out. But I'm lazy...
>
> What do you think Python doesn't have sorted out?
>

Consistency between versions and platforms. David's note with the ticket
points to a Python 3.0 bug on this reported about, oh, two years ago. If we
wait long enough this problem will eventually get fixed as old python
versions disappear and some sort of decision is made for the 3.x series. Or we
could do our own and be consistent with ourselves. There is also the problem
of long doubles on the windows platform, which isn't Python specific since
Python doesn't use long doubles. As I understand long doubles on windows,
mingw32 supports them, VS doesn't, so there is a compiler inconsistency to
deal with also.
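
To make the "do our own and be consistent with ourselves" option a bit more
concrete, here is a rough Python sketch of the kind of normalization I have in
mind. The helper names normalize_exponent and format_float are made up for
this note (they are not Numpy or Python API), and the real fix would have to
live in the C formatting code; this only shows the intended output, assuming
we settle on at least two exponent digits and a '.' decimal point.

```python
import re

# Matches a C-style exponent at the end of a formatted float, e.g. "e+020".
# Leading zeros are dropped as long as at least two exponent digits remain.
_EXPONENT = re.compile(r"e([+-])0*(\d{2,})$")


def normalize_exponent(text):
    """Rewrite '1e+020' as '1e+20'; strings without an exponent are unchanged."""
    return _EXPONENT.sub(lambda m: "e" + m.group(1) + m.group(2), text)


def format_float(x, precision=12):
    """Format x with '%.<precision>g', then normalize the exponent width.

    Python's % operator does not consult LC_NUMERIC, so the result always
    uses '.' as the decimal point and can be parsed back with float().
    """
    return normalize_exponent("%.*g" % (precision, x))
```

With this, the three-digit exponent that David's example prints on Windows
would come out as 1e+20 everywhere, and the result is still something
float() can read back.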

Chuck