[Python-bugs-list] [ python-Bugs-513866 ] Float/long comparison anomaly
noreply@sourceforge.net
Sun, 17 Feb 2002 06:43:59 -0800
Bugs item #513866, was opened at 2002-02-06 10:33
You can respond by visiting:
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=513866&group_id=5470
Category: Python Interpreter Core
Group: None
Status: Open
Resolution: Later
Priority: 5
Submitted By: Andrew Koenig (arkoenig)
Assigned to: Nobody/Anonymous (nobody)
Summary: Float/long comparison anomaly
Initial Comment:
Comparing a float and a long appears to convert the
long to float and then compare the two floats. This
strategy is a problem because the conversion might
lose precision. As a result, == is not an equivalence
relation and < is not an order relation. For example,
it is possible to create three numbers a, b, and c
such that a==b, b==c, and a!=c.
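A minimal sketch of such a triple, written for modern Python 3. Python 3 compares int and float exactly, so the old rule is emulated here with explicit float() calls; the helper name coerced_eq is hypothetical, and the values assume IEEE-754 doubles with a 53-bit significand.

```python
# Emulate the comparison rule described above: convert both operands
# to float, then compare. (coerced_eq is a hypothetical helper name.)
def coerced_eq(p, q):
    return float(p) == float(q)

a = 2**53            # exactly representable as a double
b = float(2**53)     # the same value, as a float
c = 2**53 + 1        # not representable; float(c) rounds to 2.0**53

print(coerced_eq(a, b))   # a "==" b under coercion
print(coerced_eq(b, c))   # b "==" c under coercion
print(a == c)             # but a and c are different integers
```

Under the emulated rule the first two comparisons succeed while a and c differ, so the relation is not transitive.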
----------------------------------------------------------------------
Comment By: paul rubin (phr)
Date: 2002-02-17 06:43
Message:
Logged In: YES
user_id=72053
I hope there's a simple solution to this--it's obvious what
the right result should be mathematically if you compare
1L<<10000 with 0.0. It should not raise an error. If the
documented behavior leads to raising an error, then there's
a bug in the document. I agree that it's not the highest
priority bug in the world, but it doesn't seem that complicated.
If n is a long and x is a float, both >= 0, what happens if
you do this, to implement cmp(n,x):
xl = long(x)
# if x has a fraction part and int part is == n, then x>n
if float(xl)!=x and xl==n: return -1
return cmp(n, xl)
If both are < 0, change -1 to 1 above. If x and n are of
opposite sign, the positive one is greater.
Unless I missed something (which is possible--I'm not too
alert right now) the above should be ok in all cases.
Basically you use long as the common type to convert to; you
do lose information when converting a non-integer, but for
the comparison with an integer, you don't need the lost
information other than knowing whether it was nonzero, which
you find out by converting the long back to a float.
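The idea of using long as the common type can be sketched in Python 3 roughly as follows. This is an illustration under stated assumptions, not a proposed patch: the function name cmp_via_long is hypothetical, x is assumed to be finite (int() raises on infinities and NaNs), and the result follows the usual cmp convention of being negative when n is less than x. Checking the sign of x inside the fractional-part branch covers the negative and mixed-sign cases in one place.

```python
def cmp_via_long(n, x):
    """Three-way compare of integer n with finite float x: -1, 0, or 1."""
    xl = int(x)  # truncates toward zero; exact for any finite float
    if float(xl) != x and xl == n:
        # x has a fractional part and its integer part equals n, so x
        # lies strictly on the far side of n from zero.
        return -1 if x > 0 else 1
    # Otherwise the truncated value already decides the comparison.
    return (n > xl) - (n < xl)

print(cmp_via_long(2, 2.5))
print(cmp_via_long(-2, -2.5))
print(cmp_via_long(1 << 10000, 0.0))
```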
----------------------------------------------------------------------
Comment By: Andrew Koenig (arkoenig)
Date: 2002-02-09 07:42
Message:
Logged In: YES
user_id=418174
I completely agree it's not a high-priority item,
especially because it may be complicated to fix.
I think that the fundamental problem is that there is no
common type to which both float and long can be converted
without losing information, which complicates both the
definition and implementation of comparison. Accordingly,
it might make sense to think about this issue in
conjunction with future consideration of rational numbers.
----------------------------------------------------------------------
Comment By: Tim Peters (tim_one)
Date: 2002-02-08 23:33
Message:
Logged In: YES
user_id=31435
I reopened this, but unassigned it since I can't justify
working on it (the benefit/cost ratio of fixing it is down
in the noise compared to other things that should be done).
I no longer think we'd need a PEP to change the behavior,
and agree it would be nice to change it. Changing it may
surprise people expecting Python to work like C (C99 says
that when integral -> floating conversion is in range but
can't be done exactly, either of the closest representable
floating numbers may be returned; Python inherits the
platform C's behavior here for Python int -> Python float
conversion (C long -> C double); when the conversion is out
of range, C doesn't define what happens, and Python
inherits that too before 2.2 (Infinities and NaNs are what
I've seen most often, varying by platform); in 2.2 it
raises OverflowError).
I'm not sure it's possible for a<b and b<c and a==c, unless
the platform C is inconsistent (meaning that (double)i for
a fixed i returns the next-lowest double on some calls but
the next-higher on others). This brute-force searcher
didn't turn up any examples on my box:
f = 2L**53 - 5 # straddle the representable-as-double limit
nums = [f+i for i in range(50)]
nums.extend(map(float, nums))
for a in nums:
    for b in nums:
        if not a < b:
            continue
        for c in nums:
            if not b < c:
                continue
            if a >= c:
                print `a`, `b`, `c`
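For readers on Python 3, where mixed int/float comparison is exact, the same search can be approximated by emulating the old coercion with explicit float() calls. This is only a sketch: coerced_lt is a hypothetical helper, and it assumes the platform's int-to-float conversion is consistent from call to call.

```python
# Emulate the old rule: convert both operands to float, then compare.
def coerced_lt(p, q):
    return float(p) < float(q)

f = 2**53 - 5  # straddle the representable-as-double limit
nums = [f + i for i in range(50)]
nums.extend([float(n) for n in nums])

# Collect triples with a < b and b < c where a < c fails.
violations = [
    (a, b, c)
    for a in nums
    for b in nums
    if coerced_lt(a, b)
    for c in nums
    if coerced_lt(b, c) and not coerced_lt(a, c)
]
print(violations)
```

With a consistent conversion the list comes out empty, since the comparison reduces to the ordinary ordering of doubles.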
----------------------------------------------------------------------
Comment By: Andrew Koenig (arkoenig)
Date: 2002-02-07 07:33
Message:
Logged In: YES
user_id=418174
Here is yet another surprise:
x=[1L<<10000]
y=[0.0]
z=x+y
Now I can execute x.sort() and y.sort() successfully, but
z.sort() blows up.
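A short Python 3 sketch of why the mixed list is the troublesome one: converting the large integer to float overflows, so any comparison that goes through float() has nothing sensible to return. Current Python 3 compares int and float exactly, so the sort itself succeeds there; the variable names are illustrative only.

```python
big = 1 << 10000

# The coercion step is what fails: no double can hold this value.
try:
    float(big)
    overflowed = False
except OverflowError:
    overflowed = True
print(overflowed)

# Python 3 compares int with float without converting, so sorting works.
z = [big, 0.0]
z.sort()
print(z[0])
```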
----------------------------------------------------------------------
Comment By: Andrew Koenig (arkoenig)
Date: 2002-02-07 05:28
Message:
Logged In: YES
user_id=418174
The difficulty is that as defined, < is not an order
relation, because there exist values a, b, c such that a<b,
b==c, and a==c. I believe that there also exist values
such that a<b, b<c, and a==c. Under such circumstances, it
is hard to understand how sort can work properly, which is
my real concern. Do you really want to warn people that
they shouldn't sort lists containing floats and longs?
Moreover, it is not terribly difficult to define the
comparisons so that == is an equivalence relation and < is
an order relation. The idea is that for any floating-point
system, there is a threshold T such that if x is a float
value >=T, converting x to long will not lose information,
and if x is a long value <=T, converting x to float will
not lose information. Therefore, instead of always
converting to long, it suffices to convert in a direction
chosen by comparing the operands to T (without conversion)
first.
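The threshold idea can be sketched in Python 3 as below. Several things here are assumptions rather than part of the proposal: the name cmp_long_float is hypothetical, T is taken to be 2**53 (derived from sys.float_info.mant_dig for IEEE-754 doubles), the threshold tests are applied to magnitudes so that negative values are covered, and x is assumed to be finite.

```python
import sys

# Assumed threshold: 2**53 on platforms with IEEE-754 doubles.
T = 2 ** sys.float_info.mant_dig

def cmp_long_float(n, x):
    """Return -1, 0, or 1 as integer n is <, ==, or > finite float x."""
    if abs(n) <= T:
        # n converts to float without loss, so compare as floats.
        fn = float(n)
        return (fn > x) - (fn < x)
    if abs(x) >= T:
        # x has no fractional part, so int(x) is exact; compare as ints.
        xi = int(x)
        return (n > xi) - (n < xi)
    # Here |n| > T > |x|, so the sign of n alone decides.
    return 1 if n > 0 else -1

print(cmp_long_float(2**53 + 1, float(2**53)))
print(cmp_long_float(1 << 10000, 0.0))
```

Neither branch performs a lossy conversion, which is what keeps == an equivalence relation and < an order relation in this sketch.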
----------------------------------------------------------------------
Comment By: Tim Peters (tim_one)
Date: 2002-02-06 20:59
Message:
Logged In: YES
user_id=31435
Oops! I meant
"""
could lead to a different result than the explicit coercion
in
somefloat == float(somelong)
"""
----------------------------------------------------------------------
Comment By: Tim Peters (tim_one)
Date: 2002-02-06 20:52
Message:
Logged In: YES
user_id=31435
Since the coercion to float is documented and intended,
it's not "a bug" (it's functioning as designed), although
you may wish to argue for a different design, in which case
making an incompatible change would first require a PEP and
community debate. Information loss in operations involving
floats comes with the territory, and I don't see a reason
to single this particular case out as especially
surprising. OTOH, I expect it would be especially
surprising to a majority of users if the implicit coercion
in
somefloat == somelong
could lead to a different result than the explicit coercion
in
long(somefloat) == somelong
Note that the "long" type isn't unique here: the same is
true of mixing Python ints with Python floats on boxes
where C longs have more bits of precision than C doubles
(e.g., Linux for IA64, and Crays).
----------------------------------------------------------------------