[Numpy-discussion] formatting issues, locale and co
David Cournapeau
david at ar.media.kyoto-u.ac.jp
Sun Dec 28 00:27:07 EST 2008
Hi,
While looking at the latest failures of numpy trunk on Windows for
Python 2.5 and 2.6, I got into floating point number formatting issues;
I dug deeper and deeper, and now I am lost. We have several problems:
- we are not consistent between platforms, nor are we consistent
with Python
- str(np.float32(a)) is locale dependent, but Python's str is
not (locale.str is)
- formatting of long double does not work on Windows because of the
broken long double support in mingw.
1 consistency problem:
----------------------
python -c "a = 1e20; print a" -> 1e+020
python26 -c "a = 1e20; print a" -> 1e+20
In numpy, we use PyOS_snprintf for formatting, but python itself uses
PyOS_ascii_formatd - which has different behavior on different versions
of python. The above behavior can be simply reproduced in C:
#include <Python.h>
#include <stdio.h>

int main(void)
{
    double x = 1e20;
    char c[200];

    /* Python's own formatting helper */
    PyOS_ascii_formatd(c, sizeof(c), "%.12g", x);
    printf("%s\n", c);
    /* the platform's C library printf */
    printf("%g\n", x);
    return 0;
}
On 2.5, this will print:
1e+020
1e+020
But on 2.6, this will print:
1e+20
1e+020
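One stopgap for the discrepancy above would be to post-process whatever the C library produces. The following is only a minimal sketch, assuming the sole difference between platforms is the number of exponent digits; normalize_exponent is a hypothetical helper, not an existing numpy or Python function:

```python
import re

# Matches a trailing exponent such as "e+020" or "E-7":
# group 1 = exponent letter, group 2 = optional sign, group 3 = digits
# with leading zeros stripped (at least one digit is always kept).
_EXP_RE = re.compile(r"([eE])([+-]?)0*(\d+)$")


def normalize_exponent(text, min_digits=2):
    """Hypothetical helper: pad or trim the exponent to min_digits digits.

    Strings without an exponent (e.g. "1.5", "inf", "nan") are returned
    unchanged.
    """
    def fix(match):
        letter, sign, digits = match.groups()
        return letter + sign + digits.zfill(min_digits)

    return _EXP_RE.sub(fix, text)
```

With this, normalize_exponent("1e+020") and normalize_exponent("1e+20") both give "1e+20", so the result no longer depends on which form the platform printed.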
2 locale dependency:
--------------------
Another issue is that our own formatting is locale dependent, whereas
Python's isn't:
import numpy as np
import locale
locale.setlocale(locale.LC_NUMERIC, 'fr_FR')
a = 1.2
print "str(a)", str(a)
print "locale.str(a)", locale.str(a)
print "str(np.float32(a))", str(np.float32(a))
print "locale.str(np.float32(a))", locale.str(np.float32(a))
Returns:
str(a) 1.2
locale.str(a) 1,2
str(np.float32(a)) 1,2
locale.str(np.float32(a)) 1,20000004768
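To illustrate what a locale-independent result would require, here is a minimal sketch that maps a locale-formatted number back to the "C" form. It assumes the only locale-dependent part of the string is the decimal separator (no thousands grouping); delocalize is a hypothetical helper, not part of numpy or the locale module:

```python
import locale


def delocalize(text, decimal_point=None):
    """Hypothetical helper: replace the locale's decimal separator with '.'.

    If decimal_point is not given, it is read from the current locale via
    locale.localeconv(). Assumes no thousands separators are present.
    """
    if decimal_point is None:
        decimal_point = locale.localeconv()["decimal_point"]
    if decimal_point != ".":
        # A formatted float contains at most one decimal separator.
        text = text.replace(decimal_point, ".", 1)
    return text
```

For example, delocalize("1,2", ",") gives "1.2", which matches what str(a) returns for a plain Python float regardless of the locale.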
I thought about copying the way Python does the formatting in the trunk
(where discrepancies between platforms have been fixed), but this is not
so easy, because it uses a lot of code from different places, and the
code needs to be adapted to float and long double. The other solution
would be to do our own formatting, but this does not sound easy either:
formatting in C is hard. I am not sure what we should do; does
anyone else have any ideas?
cheers,
David