[Python-Dev] Round Bug in Python 1.6?

Tim Peters tim_one@email.msn.com
Sun, 9 Apr 2000 15:42:11 -0400


[Christian Tismer]
> ...
> Here is the problem, as I see it:
> You say if you type 3.1416, you want to get exactly this back.
>
> But how should Python know that you typed it in?
> Same in my case: I just rounded to 3 digits, but how
> should Python know about this?
>
> And what do you expect when you type in 3.14160, do you want
> the trailing zero preserved or not?
>
> Maybe we would need to carry exactness around for numbers.
> Or even have a different float type for cases where we want
> exact numbers? Keyboard entry and rounding produce exact numbers.
> Simple operations between exact numbers would keep exactness,
> higher level functions would probably not.
>
> I think we delved into a very difficult domain here.

"This kind of thing" is hopeless so long as Python uses binary floating
point.  Ping latched on to "shortest" conversion because it appeared to
solve "the problem" in a specific case.  But it doesn't really solve
anything -- it just shuffles the surprises around.  For example,

>>> 3.1416 - 3.141
0.00059999999999993392
>>>

Do "shortest conversion" (relative to the universe of IEEE doubles) instead,
and it would print

0.0005999999999999339

Neither bears much syntactic resemblance to the

0.0006

the numerically naive "expect".  Do anything less than the 16 significant
digits shortest conversion happens to produce in this case, and eval'ing the
string won't return the number you started with.  So "0.0005999999999999339"
is the "best possible" string repr can produce (assuming you think "best" ==
"shortest faithful, relative to the platform's universe of possibilities",
which is itself highly debatable).
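
The digit-count claim can be checked with a crude search.  This is only a
sketch: it assumes IEEE doubles and a current Python 3, whose repr already
uses a shortest round-tripping conversion (unlike the 1.6 discussed here),
and the helper name shortest_roundtrip_digits is made up for illustration.

```python
def shortest_roundtrip_digits(x):
    """Smallest n for which the n-significant-digit string converts back to x."""
    for n in range(1, 18):  # 17 digits always suffice for an IEEE double
        if float("%.*g" % (n, x)) == x:
            return n
    raise ValueError("no round trip within 17 digits (not an IEEE double?)")

x = 3.1416 - 3.141
n = shortest_roundtrip_digits(x)

print(n, "%.*g" % (n, x))               # fewest digits that still round-trip
print("%.17g" % x)                      # fixed 17-digit conversion
print(float("%.*g" % (n - 1, x)) == x)  # one digit fewer no longer round-trips
print("%.12g" % x)                      # 12 digits: 0.0006
```

Rounding to n digits and testing is not a true shortest-conversion
algorithm, but it is enough to see where faithfulness is lost for this value.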

If you don't want to see that at the interactive prompt, one of two things
has to change:

A) Give up on eval(repr(x)) == x for float x, even on a single machine.

or

B) Stop using repr by default.

There is *no* advantage to #A over the long haul:  lying always extracts a
price, and unlike most of you <wink>, I appeared to be the lucky email
recipient of the passionate gripes about repr(float)'s inadequacy in 1.5.2
and before.  Giving a newbie an illusion of comfort at the cost of making it
useless for experts is simply nuts.
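
What the "lying" in #A costs can be sketched as follows.  The function
lossy_repr is a hypothetical stand-in for a friendlier repr, and its 12
significant digits are an assumption chosen for illustration, not a
description of any particular release.

```python
def lossy_repr(x):
    # Hypothetical "friendly" repr: 12 significant digits, so not faithful.
    return "%.12g" % x

a = 3.1416 - 3.141
b = 0.0006

print(lossy_repr(a), lossy_repr(b))  # 0.0006 0.0006
print(a == b)                        # False: two different floats, one string
print(float(lossy_repr(a)) == a)     # False: the string does not give a back
```

Once two distinct values print identically, the output can no longer be
used to tell them apart or to reconstruct either one.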

The desire for #B pops up from multiple sources:  people trying to use
native non-ASCII chars in strings; people just trying to display docstrings
without embedded "\012" (newline) and "\011" (tab) escapes; and people using
"big" types (like NumPy arrays or rationals) where repr() can produce
unboundedly more info than the interactive user typically wants to see.

It *so happens* that str() already "does the right thing" in all three of
those cases, and also happens to produce "0.0006" for the example
above.  This is why people leap to:

C) Use str by default instead of repr.

But str doesn't pass down to containees, and *partly* does a wrong thing
when applied to strings, so it's not suitable either.  It's *more* suitable
than repr, though!
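
Both objections to str can be seen in a few lines.  This is a sketch only;
the class Tagged is invented for illustration, and the behavior shown is
that of a current Python 3, where the built-in containers still format
their items with repr.

```python
class Tagged:
    """Hypothetical type whose str() and repr() differ visibly."""

    def __str__(self):
        return "friendly"

    def __repr__(self):
        return "Tagged()"

t = Tagged()
print(str(t))                # friendly
print(str([t]))              # [Tagged()]  (the list uses repr on its items)
print(str("42") == str(42))  # True: str() drops the quotes from a string
print(str(["42", 42]))       # ['42', 42]  (quotes return inside a container)
```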

trade-off-ing-ly y'rs   - tim