Unexpected behaviour of math.floor, round and int functions (rounding)

Chris Angelico rosuav at gmail.com
Sat Nov 20 17:17:09 EST 2021


On Sun, Nov 21, 2021 at 8:32 AM Avi Gross via Python-list
<python-list at python.org> wrote:
>
> This discussion gets tiresome for some.
>
> Mathematics is a pristine world that is NOT the real world. It handles
> near-infinities fairly gracefully but many things in the real world break
> down because our reality is not infinitely divisible and some parts are
> neither contiguous nor fixed but in some sense wavy and probabilistic or
> worse.

But the purity of mathematics isn't the problem. The problem is
people's expectations around computers. (The problem is ALWAYS
people's expectations.)

> So in any computer, or computer language, we have realities to deal with
> when someone asks for say the square root of 2 or other transcendental
> numbers like pi or e or things like the sin(x) as often they are numbers
> which in decimal require an infinite number of digits and in many cases do
> not repeat. Something as simple as the fraction 1/7, in decimal, has an
> interesting repeating pattern but is otherwise infinite.
>
> .142857142857142857 ... ->> 1/7
> .285714285714285714 ... ->> 2/7
> .428571 ...
> .571428 ...
> .714285 ...
> .857142 ...
>
> No matter how many bits you set aside, you cannot capture such numbers
> exactly IN BASE 10.

Right, and people understand this. Yet as soon as you switch from base
10 to base 2, people find it impossible to accept that 1/5 becomes the
exact same thing: an infinitely repeating expansion of a rational
number.
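A quick illustration (plain CPython, standard library only): the same kind of rational repeats forever in one base and terminates in another, and 1/5 is exact in decimal but repeats forever in binary, so the float literal 0.2 is really the nearest 64-bit double:

```python
from decimal import Decimal, getcontext
from fractions import Fraction

getcontext().prec = 24
print(Decimal(1) / Decimal(7))  # 0.142857142857142857142857 -- repeats in base 10

# The float literal 0.2 is the nearest base-2 double, not one fifth:
print(Decimal(0.2))   # 0.200000000000000011102230246251565404236316680908203125
print(Fraction(0.2))  # 3602879701896397/18014398509481984 (denominator is 2**54)
print(Fraction(0.2) == Fraction(1, 5))  # False
```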

> You may be able to capture some such things in another base but then yet
> others cannot be seen in various other bases. I suspect someone has
> considered a data type that stores results in arbitrary bases and delays
> evaluation as late as possible, but even those cannot handle many numbers.

More likely it would just store rationals as rationals - or, in other
words, fractions.Fraction().
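For the curious, a minimal sketch of what that looks like with the standard library's fractions module: every operation stays an exact ratio of integers, and rounding happens only if and when you ask for a float at the end.

```python
from fractions import Fraction

third = Fraction(1, 3)
print(third + third + third)         # 1, exactly -- no rounding anywhere
print(Fraction(6) * Fraction(1, 3))  # 2, exactly
print(float(third))                  # 0.3333333333333333 -- rounded only on exit
```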

> So the reality is that most computer programming is ultimately BINARY as in
> BASE 2. At some level almost anything is rounded and imprecise. About all we
> want to guarantee is that any rounding or truncation done is as consistent
> as possible so every time you ask for pi or the square root of 2, you get
> the same result stored as bits. BUT if you ask a slightly different
> question, why expect the same results? sqrt(2) operates on the number 2. But
> sqrt(6*(1/3)) first evaluates 1/3 and stores it as bits then multiplies it
> by the bit representation of 6 and stores a result which then is handed to
> sqrt() and if the bits are not identical, there is no guarantee that the
> result is identical.

This is what I take issue with. Binary doesn't mean "rounded and
imprecise". It means "base two". People get stroppy at a computer's
inability to represent 0.3 correctly, because they think that it
should be perfectly obvious what that value is. Nobody's bothered by
sqrt(2) not being precise, but they're very much bothered by 1/10 not
"working".
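The point is worth making concrete. The classic surprise is not randomness or inaccuracy; it is two different nearest-double roundings being compared (plain Python, nothing hidden):

```python
# 0.1, 0.2 and 0.3 are each the nearest 64-bit double to the decimal
# you typed -- deterministic, repeatable, exact binary values:
print(0.1 + 0.2)         # 0.30000000000000004
print(0.1 + 0.2 == 0.3)  # False: the sum rounds to a different double than 0.3
print(0.3 == 0.3)        # True: the same literal always gives the same bits
```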

> Do note pure Mathematics is just as confusing at times. The number
> .99999999... where the dot-dot-dot notation means go on forever, is
> mathematically equivalent to the number 1 as is any infinite series that
> asymptotically approaches 1 as in
>
>         1/2 + 1/4 + 1/8 + ... + 1/(2**N) + ...
>
> Many students cannot see how continually appending 9s can ever be the same
> as a number like 1.00000, since no individual digit ever matches. But the
> mathematical theorems about limits are now well understood
> and in the limit as N approaches infinity, the two come to mean the same
> thing.
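One way to make that limit tangible (a small sketch using exact rationals, so no float rounding muddies the water): each partial sum of that series falls short of 1 by exactly the last term added.

```python
from fractions import Fraction

partial = Fraction(0)
for n in range(1, 11):
    partial += Fraction(1, 2 ** n)   # 1/2 + 1/4 + ... + 1/2**10

print(partial)      # 1023/1024
print(1 - partial)  # 1/1024 -- the gap halves with every extra term
```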

Mathematics is confusing. That's not a problem. To be quite frank, the
real world is far more confusing than the pristine beauty that we have
inside a computer. The problem isn't the difference between reality
and mathematics, or between reality and computers, or anything like
that; the problem, as always, is between people's expectations and
what computers do.

Tell me: if a is equal to b and b is equal to c, is a equal to c?
Mathematicians say "of course it is". Engineers say "there's no way
you can rely on that". Computer programmers side with whoever makes
most sense right this instant.

> So, what should be stressed, and often is, is to use tools available that
> let you compare numbers for being nearly equal.

No. No no no no no. You don't need to use a "nearly equal" comparison
just because floats are "inaccurate". It isn't like that. It's this
exact misinformation that I am trying to fight, because floats are NOT
inaccurate. They're just in binary, same as everything that computers
do.
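To back that up: equality on floats is exact and reliable whenever the values involved are exactly representable, which includes every sum of powers of two and every integer up to 2**53. (A sketch; the 2**53 bound comes from the 53-bit significand of an IEEE-754 double, which CPython floats use on all mainstream platforms.)

```python
# Dyadic fractions (denominator a power of two) are exact, so == is safe:
assert 0.5 + 0.25 == 0.75
assert 1.5 * 2 == 3.0

# Integers are exact in a double up to 2**53 ...
assert 2.0 ** 53 == 9007199254740992

# ... after which consecutive integers can no longer all be represented:
assert 2.0 ** 53 + 1 == 2.0 ** 53
```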

> I note how unamused I was when making a small table in EXCEL (Note, not
> Python) of credit card numbers and balances when I saw the darn credit card
> numbers were too long and a number like:
>
> 4195032150199578
>
> was displayed by EXCEL as:
>
> 4195032150199570
>
> It looks like I just exceeded the number of significant digits stored, and
> EXCEL reconstructed the rest by filling in a zero for the missing digit. The
> problem is
> I had to check balances sometimes and copy/paste generated the wrong number
> to use. I ended up storing the number as text using '4195032150199578 as I
> was not doing anything mathematical with it and this allowed me to keep all
> the digits as text strings can be quite long.
>
> But does this mean EXCEL is useless (albeit some think so) or that the tool
> can only be used up to some extent and beyond that, can (silently) mislead
> you?

Oh, Excel is moronic in plenty of other ways.

https://www.youtube.com/watch?v=yb2zkxHDfUE
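Worth noting, to be fair to the binary format: that particular credit card number is below 2**53, so a 64-bit double holds it exactly, and Python round-trips it without loss. As I understand it, the truncation described above comes from Excel's own policy of rounding entries to 15 significant digits, not from the underlying float:

```python
n = 4195032150199578
print(n < 2 ** 53)    # True: fits within the 53-bit significand of a double
print(float(n) == n)  # True: the conversion is exact
print(int(float(n)))  # 4195032150199578 -- no trailing-zero damage here
```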

> Having said all that, this reminds me a bit about the Y2K issues where
> somehow nobody thought much about what happens when the year 2000 arrives
> and someone 103 years old becomes 3 again as only the final two digits of
> the year are stored. We now have the ability to make computers with
> increased speed and memory and so on and I wonder if anyone has tried to
> make a standard for say a 256-byte storage for multiple-precision floating
> point that holds lots more digits of precision as well as allowing truly
> huge exponents. Of course, it may not be practical to have computers that
> have registers and circuitry that can multiply two such numbers in a very
> few cycles, and it may be done in stages in thousands of cycles, so use of
> something big like that might not be a good default.
>

Yes, you could use 80-bit floats, 128-bit floats, or 256-bit floats,
but that won't change the fact that 0.3 can't be represented precisely
in binary, nor will it change the fact that 0.5 *can*. If people can't
think in binary, they won't think in binary with more bits either.
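A short sketch of why more bits can never rescue 0.3 (the helper name here is mine, for illustration): a fraction terminates in binary exactly when its lowest-terms denominator is a power of two, and that property does not depend on precision.

```python
from fractions import Fraction

def exact_in_binary(numerator, denominator):
    """True if numerator/denominator has a terminating binary expansion."""
    d = Fraction(numerator, denominator).denominator  # lowest terms
    return d & (d - 1) == 0  # power-of-two test on the denominator

print(exact_in_binary(1, 2))   # True:  0.5 is exact at any width
print(exact_in_binary(3, 10))  # False: 0.3 repeats forever, at 64 or 256 bits
```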

ChrisA


More information about the Python-list mailing list