[Python-Dev] Numerical robustness, IEEE etc.

Kevin Jacobs <jacobs@bioinformed.com> bioinformed at gmail.com
Sat Jun 24 17:50:36 CEST 2006


On 6/23/06, Nick Maclaren <nmm1 at cus.cam.ac.uk> wrote:
>
> jacobs at bioinformed.com wrote:
> >
> > > >Unfortunately, that doesn't help, because it is not where the issues
> > > >are.  What I don't know is how much you know about numerical models,
> > > >IEEE 754 in particular, and C99.  You weren't active on the SC22WG14
> > > >reflector, but there were some lurkers.
> >
> > Hand wave, hand wave, hand wave.  [...]
>
> SC22WG14 is the ISO committee that handles C standardisation.  [...]


I'm not asking you to describe SC22WG14 or post detailed technical summaries
of the long and painful road.  I'd like you to post things directly relevant
to Python with footnotes to necessary references.  It is then incumbent on
those that wish to respond to your post to familiarize themselves with the
relevant background material.  However, it is really darn hard to do that
when we don't know what you're trying to fix in Python.  The examples you
show below are a good start in that direction.

> > A good place to start: You mentioned earlier that there were some
> > nonsensical things in floatobject.c.  Can you list some of the most
> > serious of these?
>
> Well, try the following for a start:
>
> Python 2.4.2 (#1, May  2 2006, 08:28:01)
> [GCC 4.1.0 (SUSE Linux)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> a = "NaN"
> >>> b = float(a)
> >>> c = int(b)
> >>> d = (b == b)
> >>> print a, b, c, d
> NaN nan 0 False


> Python 2.3.3 (#1, Feb 18 2004, 11:58:04)
> [GCC 2.8.1] on sunos5
> Type "help", "copyright", "credits" or "license" for more information.
> >>> a = "NaN"
> >>> b = float(a)
> >>> c = int(b)
> >>> d = (b == b)
> >>> print a, b, c, d
> NaN NaN 0 True
>
> That demonstrates that the error state is lost by converting to int,
> and that NaN testing isn't reliable.
>


Now we're getting down to business.  There are actually at least three
issues that I see:

1) The string representation of NaN is not standardized across platforms.
2) On a sane platform, int(float('NaN')) should raise a ValueError
exception for the int() portion.
3) float('NaN') == float('NaN') should be False by default, assuming NaN
is not a signaling NaN.
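
To make (2) and (3) concrete, here is a minimal pure-Python sketch of the
behaviour I have in mind.  The helper names is_nan and checked_int are
hypothetical, chosen only for illustration; nothing like them exists in
floatobject.c, and the sketch assumes the platform compares NaN as unequal
to itself.

```python
def is_nan(x):
    """Return True if x is a NaN, relying on the IEEE 754 rule NaN != NaN."""
    # Assumption: the platform's comparison follows IEEE 754 for quiet NaNs.
    return x != x


def checked_int(x):
    """Convert x to int, but refuse NaN instead of silently returning 0."""
    if is_nan(x):
        raise ValueError("cannot convert NaN to int")
    return int(x)
```

With this, checked_int(2.7) gives 2, while checked_int applied to a NaN
raises ValueError rather than losing the error state.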

If we include Windows:

Python 2.5b1 (r25b1:47027, Jun 20 2006, 09:31:33) [MSC v.1310 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> a = "NaN"
>>> b = float(a)

Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    b = float(a)
ValueError: invalid literal for float(): NaN
>>>

So:
  4) In addition to #1, the platform atof sometimes doesn't accept any
conventional spelling of NaN.
  5) All of the above likely applies to infinities and +-0 as well.
So the open question is how both to define the semantics of Python
floating-point operations and to implement them in a way that verifiably
works on the vast majority of platforms, without turning the code into a
maze of platform-specific defines, kludges, or maintenance problems waiting
to happen.

Thanks,
-Kevin