[Numpy-discussion] formatting issues, locale and co

David Cournapeau cournape at gmail.com
Sun Dec 28 23:38:12 EST 2008


On Sun, Dec 28, 2008 at 4:12 PM, Charles R Harris
<charlesr.harris at gmail.com> wrote:
>
>
> On Sat, Dec 27, 2008 at 11:40 PM, David Cournapeau
> <david at ar.media.kyoto-u.ac.jp> wrote:
>>
>> Robert Kern wrote:
>> >
>> > We should not support locales. The string representations of these
>> > elements should be Python-parseable.
>> >
>>
>> It looks like I was wrong in my analysis of the problem: I thought I was
>> using the most recent implementation of PyOS_* functions in my test
>> codes, but the ones in 2.6 are not the same as the ones in the current
>> trunk. So the problem may be easier to fix than I first thought:
>> simply providing our own PyOS_ascii_formatd (and similar for float and
>> long double) may be enough, and since we don't care about locale (%Z and
>> %n), the function is simple (and can be pulled out from python sources).
>>
>> We would then use PyOS_ascii_format* (locale independent) instead of
>> PyOS_snprintf (locale dependent) in the str/repr implementation of
>> scalar arrays. Does that sound acceptable to you?
>

I put yesterday's work in the fix_float_format branch:
 - it fixes the locale issue.
 - it fixes the long double issue on Windows.
 - it also fixes some tests (we were not testing single precision
formatting, but double precision twice instead; the single precision
test fails on the trunk, BTW).
 - it handles inf and nan more consistently across platforms (e.g.
str(np.log(0)) will be '-inf' on all platforms; on Windows, it used to
be '-1.#INF'. I was afraid this would break converting the string back
to float, but that was already broken before my change, e.g.
float('-1.#INF') does not work on Windows).
 - for now, it breaks on Windows with Python 2.5, because float(1e10)
used to print as 1e+010 on Python 2.5 and prints as 1e+10 on Python 2.6
(to be more consistent with C99). But I could simply force backward
compatibility with Python 2.5/2.4, since I can control the number of
digits in the exponent in the formatting code.
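
To illustrate the last two points, here is a minimal Python sketch of a
formatter that always spells infinities and NaN the same way and
controls the number of exponent digits. It is not the C code from the
fix_float_format branch: the names format_double, precision and
exp_digits are hypothetical and chosen only for this example. It relies
on the fact that Python's % formatting of floats does not consult the
current locale.

```python
import math
import re

# Hypothetical helper for illustration; not the C code from the branch.
_EXPONENT = re.compile(r"^(.*e[+-])(\d+)$")


def format_double(x, precision=12, exp_digits=2):
    """Format x with fixed inf/nan spellings and a minimum exponent width."""
    if math.isnan(x):
        return "nan"
    if math.isinf(x):
        return "inf" if x > 0 else "-inf"
    # Python's % float formatting is independent of the current locale.
    text = "%.*g" % (precision, x)
    match = _EXPONENT.match(text)
    if match:
        head, digits = match.groups()
        # Drop any zero padding, then pad to the requested exponent width.
        digits = digits.lstrip("0") or "0"
        text = head + digits.rjust(exp_digits, "0")
    return text


print(format_double(float("-inf")))       # -inf
print(format_double(1e20))                # 1e+20
print(format_double(1e20, exp_digits=3))  # 1e+020
```

Passing exp_digits=3 reproduces the three-digit exponent style
mentioned above for Python 2.5 on Windows, which is one way backward
compatibility could be kept behind a single parameter.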

There are still some problems related to double which I am not sure
how to solve:

import numpy as np
a = 1e10
print np.float32(a) # -> call format_float
print np.float64(a) # -> do not call format_double
print np.float96(a) # -> call format_longdouble

I guess the difference with float64 comes from its multiple inheritance
(that is, it derives from the builtin float, and the rules for print
are different than for the others). Is this behavior the expected one?
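
The inheritance relationship itself can be checked with a short
script. This is only a sketch: it confirms which scalar types derive
from the builtin float, but it does not show which formatting function
print ends up calling. It uses np.longdouble rather than np.float96,
on the assumption that float96 is not available on every platform.

```python
import numpy as np

# Report, for each scalar type, whether it derives from the builtin
# float, and list its method resolution order.
for scalar_type in (np.float32, np.float64, np.longdouble):
    print(scalar_type.__name__, issubclass(scalar_type, float))
    print("  MRO:", [cls.__name__ for cls in scalar_type.__mro__])
```

Only float64 should report the builtin float among its bases, which is
consistent with the guess that it is the one type treated differently.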

cheers,

David


