[Numpy-discussion] formatting issues, locale and co

Mon Dec 29 00:36:40 EST 2008

On Sun, Dec 28, 2008 at 9:38 PM, David Cournapeau <cournape at gmail.com> wrote:

> On Sun, Dec 28, 2008 at 4:12 PM, Charles R Harris
> <charlesr.harris at gmail.com> wrote:
> >
> >
> > On Sat, Dec 27, 2008 at 11:40 PM, David Cournapeau
> > <david at ar.media.kyoto-u.ac.jp> wrote:
> >>
> >> Robert Kern wrote:
> >> >
> >> > We should not support locales. The string representations of these
> >> > elements should be Python-parseable.
> >> >
> >>
> >> It looks like I was wrong in my analysis of the problem: I thought I was
> >> using the most recent implementation of PyOS_* functions in my test
> >> codes, but the ones in 2.6 are not the same as the ones in the current
> >> trunk. So the problem may be easier to fix than I first thought:
> >> simply providing our own PyOS_ascii_formatd (and similar for float and
> >> long double) may be enough, and since we don't care about locale (%Z and
> >> %n), the function is simple (and can be pulled out from python sources).
> >>
> >> We would then use PyOS_ascii_format* (locale independent) instead of
> >> PyOS_snprintf (locale dependent) in the str/repr implementation of scalar
> >> arrays. Does that sound acceptable to you?
> >
>
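
For illustration, the difference between locale-dependent and
locale-independent formatting can be seen from Python itself. This is only a
sketch: the helper name and the candidate locale names are assumptions, and
if none of those locales is installed the two strings simply come out the
same.

```python
import locale

# Hypothetical helper: try a few names for a locale that uses a decimal comma.
def set_comma_locale(candidates=("de_DE.UTF-8", "de_DE", "German_Germany.1252")):
    for name in candidates:
        try:
            locale.setlocale(locale.LC_NUMERIC, name)
            return name
        except locale.Error:
            continue
    return None  # none of the candidate locales is installed

saved = locale.setlocale(locale.LC_NUMERIC)  # query the current setting
try:
    active = set_comma_locale()
    value = 1234.5
    locale_aware = locale.format_string("%.1f", value)  # follows LC_NUMERIC
    locale_free = repr(value)                            # ignores LC_NUMERIC
    print(active, locale_aware, locale_free)
    # Only the locale-independent form is guaranteed to parse back.
    assert float(locale_free) == value
finally:
    locale.setlocale(locale.LC_NUMERIC, saved)  # restore the original locale
```

The point of the comparison is that only the second string is guaranteed to
be Python-parseable regardless of the user's locale.
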
> I put yesterday's work in the fix_float_format branch:
>  - it fixes the locale issue
>  - it fixes the long double issue on Windows.
>  - it also fixes some tests (we were not testing single precision
> formatting but double precision twice instead - the single precision
> test fails on the trunk BTW).

Curious, I don't see any test failures here. Were the tests actually being
run or is something else different in your test setup? Or do you mean the
fixed-up test fails?

>
>  - it handles inf and nan more consistently across platforms (e.g.
> str(np.log(0)) will be '-inf' on all platforms; on Windows, it used to
> be '-1.#INF' - I was afraid it would break converting the string back
> to float, but that was already broken before my change, e.g.
> float('-1.#INF') does not work on Windows).
>  - for now, it breaks on Windows with Python 2.5, because float(1e10) used
> to be 1e+010 on Python 2.5 and is 1e+10 on Python 2.6 (to be more
> consistent with C99). But I could simply force backward
> compatibility with Python 2.5/2.4, since I can control the number of
> digits in the exponent in the formatting code.
>
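
The round-trip point about inf can be checked with plain Python floats. A
minimal sketch; the helper name parses_as_float is made up for the example:

```python
def parses_as_float(text):
    """Return True if float() accepts the string."""
    try:
        float(text)
    except ValueError:
        return False
    return True

neg_inf = float("-inf")
print(repr(neg_inf))                   # the C99-style spelling
print(parses_as_float(repr(neg_inf)))  # that spelling converts back
print(parses_as_float("-1.#INF"))      # the Windows spelling quoted above
```

The '-inf' spelling converts back to a float, while the '-1.#INF' spelling
is rejected by float().
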
> There are still some problems related to double which I am not sure
> how to solve:
>
> import numpy as np
> a = 1e10
> print np.float32(a) # -> call format_float
> print np.float64(a) # -> do not call format_double
> print np.float96(a) # -> call format_longdouble
>
> I guess the difference with float64 comes from its multiple inheritance
> (that is, it derives from the builtin float, and the rules for print
> are different than for the others). Is this behavior the expected one?
>

Expected, but I would like to see it change because it is kind of
frustrating. Fixing it probably involves setting a function pointer in the
type definition but I am not sure about that. We might also want to do
something about integers, as in Python 3.0 they will all be Python long
integers. I don't know if that actually breaks anything in numpy, or how
Python 3.0 implements integers, but it might be a good idea not to derive
from Python integers. How that will affect indexing speed I don't know.
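
A quick way to see which scalar types inherit from the builtin types is
sketched below. It assumes a current numpy running on Python 3, where the
numpy integer scalars no longer derive from int; the choice of types to
inspect is just for illustration.

```python
import numpy as np

# Which numpy floating-point scalar types inherit from the builtin float?
for scalar_type in (np.float32, np.float64, np.longdouble):
    print(scalar_type.__name__, issubclass(scalar_type, float))

# On Python 3 the numpy integer scalars do not derive from the builtin int.
print(np.int64.__name__, issubclass(np.int64, int))
```

Only float64 reports the builtin float as a base class, which matches the
odd one out in the print example above.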

Chuck