[Python-Dev] Round Bug in Python 1.6?

Ka-Ping Yee ping@lfw.org
Sun, 9 Apr 2000 03:46:41 -0700 (PDT)


In a previous message, I wrote:
> > It's very jarring to type something in, and have the interpreter
> > give you back something that looks very different.
[...]
> > It breaks a fundamental rule of consistency, and that damages the user's
> > trust in the system or their understanding of the system.

Then on Fri, 7 Apr 2000, Tim Peters replied:
> If they're surprised by this, they indeed don't understand the arithmetic at
> all!  This is an argument for using a different form of arithmetic, not for
> lying about reality.

This is not lying!  If you type in "3.1416" and Python says "3.1416",
then indeed it is the case that "3.1416" is a correct way to type in
the floating-point number being expressed.  So "3.1415999999999999"
is not any more truthful than "3.1416" -- it's just more annoying.

I just tried this in Python 1.5.2+:
    
    >>> .1
    0.10000000000000001
    >>> .2
    0.20000000000000001
    >>> .3
    0.29999999999999999
    >>> .4
    0.40000000000000002
    >>> .5
    0.5
    >>> .6
    0.59999999999999998
    >>> .7
    0.69999999999999996
    >>> .8
    0.80000000000000004
    >>> .9
    0.90000000000000002

Ouch.
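
The point that the short and long spellings name the same number can be checked directly. The following is a minimal sketch in present-day Python 3, assuming IEEE-754 doubles and assuming the long outputs above correspond to a 17-significant-digit "%.17g" conversion; the helper names long_form and same_double are hypothetical, chosen only for illustration.

```python
def long_form(x):
    # Assumption: 17 significant digits, which is what the transcript
    # above appears to show.
    return '%.17g' % x

def same_double(a, b):
    # Two decimal strings are equivalent inputs if they parse to the
    # same floating-point value.
    return float(a) == float(b)

for text in ('.1', '.2', '.3', '.5'):
    x = float(text)
    print(text, '->', long_form(x), same_double(text, long_form(x)))
```

Every line of the loop reports True in the last column: the long form carries no information that the short form lacks.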


I wrote:
> > (What do you do then, start explaining the IEEE double representation
> > to your CP4E beginner?)

Tim replied:
> As above.  repr() shouldn't be used at the interactive prompt anyway (but
> note that I did not say str() should be).

What, then?  Introduce a third conversion routine and further
complicate the issue?  I don't see why it's necessary.

I wrote:
> > What should really happen is that floats intelligently print in
> > the shortest and simplest manner possible

Tim replied:
> This can be done, but only if Python does all fp I/O conversions entirely on
> its own -- 754-conforming libc routines are inadequate for this purpose

Not "all fp I/O conversions", right?  Only repr(float) needs to
be implemented for this particular purpose.  Other conversions
like "%f" and "%g" can be left to libc, as they are now.

I suppose for convenience's sake it may be nice to add another
format spec so that one can ask for this behaviour from the "%"
operator as well, but that's a separate issue (perhaps "%r" to
insert the repr() of an argument of any type?).
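
As an illustration of what such a format spec would do, present-day Python 3 accepts "%r" in the "%" operator and inserts the repr() of the argument. This is a sketch only; the helper name via_percent_r is hypothetical.

```python
def via_percent_r(x):
    # Wrap x in a 1-tuple so that a tuple argument is formatted as one
    # value instead of being unpacked into several.
    return '%r' % (x,)

print(via_percent_r(0.5))      # same text as repr(0.5)
print(via_percent_r('spam'))   # same text as repr('spam')
```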

> For background and code, track down "How To Print Floating-Point Numbers
> Accurately" by Steele & White, and its companion paper (s/Print/Read/)

Thanks!  I found 'em.  Will read...

I suggested:
> >     def smartrepr(x):
> >         p = 17
> >         while eval('%%.%df' % (p - 1) % x) == x: p = p - 1
> >         return '%%.%df' % p % x

Tim replied:
> This merely exposes accidents in the libc on the specific platform you run
> it.  That is, after
> 
>     print smartrepr(x)
> 
> on IEEE-754 platform A, reading that back in on IEEE-754 platform B may not
> yield the same number platform A started with.

That is not repr()'s job.  Once again:

    repr() is not for the machine.

It is not part of repr()'s contract to ensure the kind of
platform-independent conversion you're talking about.  It
prints out the number in a way that upholds the eval(repr(x)) == x
contract for the system you are currently interacting with, and
that's good enough.
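
The "shortest string that still reads back as x on this system" idea can be sketched as follows in present-day Python 3. This is a variation on the smartrepr suggestion quoted above, not the same code: it assumes "%g" (significant digits) in place of "%f", float() in place of eval(), and a search upward from 1 digit so the precision never goes negative. It relies on the platform's own conversions, so it makes only the local round-trip promise described here and no cross-platform one.

```python
def smartrepr(x):
    """Shortest '%.<p>g' rendering of x that reads back as x locally."""
    for p in range(1, 18):
        s = '%.*g' % (p, x)       # p significant digits
        if float(s) == x:         # local round-trip check
            return s
    # Nothing shorter round-trips (or x is NaN, which never compares
    # equal to itself); fall back to 17 significant digits.
    return '%.17g' % x

for x in (0.1, 0.3, 3.1416, 1e22):
    print(smartrepr(x))
```

Note that this sketch prints 1.0 as "1"; a real repr() would presumably want to keep a ".0" so the result still reads as a float.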

If you wanted platform-independent serialization, you would
use something else.  As long as the language reference says

    "These represent machine-level double precision floating
    point numbers. You are at the mercy of the underlying
    machine architecture and C implementation for the accepted
    range and handling of overflow."

and until Python specifies the exact sizes and behaviours of
its floating-point numbers, you can't expect these kinds of
cross-platform guarantees anyway.


Here are the expectations I've come to have:

    str()'s contract:
      - if x is a string, str(x) == x
      - otherwise, str(x) is a reasonable string coercion from x

    repr()'s contract:
      - if repr(x) is syntactically valid, eval(repr(x)) == x
      - repr(x) displays x in a safe and readable way
      - for objects composed of basic types, repr(x) reflects
          what the user would have to say to produce x

    pickle's contract:
      - pickle.dumps(x) is a platform-independent serialization
        of the value and state of object x
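
The three contracts above can be spot-checked for simple values. The following is a minimal sketch in present-day Python 3; the function name check_contracts is hypothetical, and the pickle check only demonstrates a round trip on the current machine, not platform independence.

```python
import pickle

def check_contracts(x):
    """Report which of the stated contracts hold for x on this system."""
    results = {}
    if isinstance(x, str):
        # str() of a string is the string itself.
        results['str'] = (str(x) == x)
    # repr() of an object built from basic types evaluates back to it.
    results['repr'] = (eval(repr(x)) == x)
    # pickle restores an equal object from its serialized form.
    results['pickle'] = (pickle.loads(pickle.dumps(x)) == x)
    return results

print(check_contracts('abc'))
print(check_contracts([1, 2.5, 'x']))
```

The eval() call is only reasonable here because the inputs are trusted literals built from basic types.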


-- ?!ng