[Python-bugs-list] [ python-Bugs-513866 ] Float/long comparison anomaly

Sun, 17 Feb 2002 06:43:59 -0800

Bugs item #513866, was opened at 2002-02-06 10:33
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=513866&group_id=5470

Category: Python Interpreter Core
Group: None
Status: Open
Resolution: Later
Priority: 5
Submitted By: Andrew Koenig (arkoenig)
Assigned to: Nobody/Anonymous (nobody)
Summary: Float/long comparison anomaly

Initial Comment:
Comparing a float and a long appears to convert the 
long to float and then compare the two floats.  This 
strategy is a problem because the conversion might 
lose precision.  As a result, == is not an equivalence 
relation and < is not an order relation.  For example, 
it is possible to create three numbers a, b, and c 
such that a==b, b==c, and a!=c.
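
A minimal sketch of such a triple, written for present-day Python 3 (where `long` has been merged into `int` and mixed comparisons are exact) and assuming IEEE 754 doubles.  `legacy_eq` is a hypothetical helper that mimics the convert-to-float strategy described above; it is not the interpreter's actual code.

```python
def legacy_eq(p, q):
    """Emulate the reported strategy: if either operand is a float,
    coerce both to float before comparing (hypothetical helper)."""
    if isinstance(p, float) or isinstance(q, float):
        return float(p) == float(q)
    return p == q

a = 2**53           # exactly representable as a double
b = float(2**53)    # the same value, stored as a float
c = 2**53 + 1       # not representable; rounds to 2**53 on conversion

a_eq_b = legacy_eq(a, b)   # both sides become 2.0**53
b_eq_c = legacy_eq(b, c)   # c loses its low bit when coerced
a_eq_c = legacy_eq(a, c)   # two integers: compared without coercion
```

Under this emulation a==b and b==c hold while a==c does not, which is the failure of transitivity the report describes.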

----------------------------------------------------------------------

Comment By: paul rubin (phr)
Date: 2002-02-17 06:43

Message:
Logged In: YES 
user_id=72053

I hope there's a simple solution to this--it's obvious what
the right result should be mathematically if you compare
1L<<10000 with 0.0.  It should not raise an error.  If the
documented behavior leads to raising an error, then there's
a bug in the document.  I agree that it's not the highest
priority bug in the world, but it doesn't seem that complicated.

If n is a long and x is a float, both >= 0, what happens if
you do this, to implement cmp(n,x):

   xl = long(x)
   # if x has a fraction part and int part is == n, then x>n
   if float(xl)!=x and xl==n: return -1
   return cmp(n, xl)

If both are < 0, change -1 to 1 above.  If x and n are of
opposite sign, the positive one is greater.

Unless I missed something (which is possible--I'm not too
alert right now) the above should be ok in all cases.

Basically you use long as the common type to convert to; you
do lose information when converting a non-integer, but for
the comparison with an integer, you don't need the lost
information other than knowing whether it was nonzero, which
you find out by converting the long back to a float.
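
The idea above can be sketched in Python 3 roughly as follows.  The name `cmp_int_float` is made up for illustration, `int` stands in for the old `long`, and the handling of infinities and NaN is an added assumption that the comment does not discuss.  Using the sign of x for the fractional case lets one function cover positive, negative, and mixed-sign operands.

```python
import math

def cmp_int_float(n, x):
    """Return -1, 0 or 1 as integer n is <, == or > float x.
    Hypothetical sketch; uses the integer type as the common type."""
    if math.isnan(x):
        raise ValueError("NaN is unordered")      # assumption, not in the thread
    if math.isinf(x):
        return -1 if x > 0 else 1                 # assumption, not in the thread
    xl = int(x)  # truncates toward zero; exact for any finite float
    if xl == n and float(xl) != x:
        # x has a fraction part and its integer part equals n, so x lies
        # strictly beyond n in the direction of its own sign.
        return -1 if x > 0 else 1
    return (n > xl) - (n < xl)
```

For example, `cmp_int_float(1 << 10000, 0.0)` returns 1 without ever converting the large integer to a float, so no OverflowError can arise.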

----------------------------------------------------------------------

Comment By: Andrew Koenig (arkoenig)
Date: 2002-02-09 07:42

Message:
Logged In: YES 
user_id=418174

I completely agree it's not a high-priority item, 
especially because it may be complicated to fix.

I think that the fundamental problem is that there is no 
common type to which both float and long can be converted 
without losing information, which complicates both the 
definition and implementation of comparison.  Accordingly, 
it might make sense to think about this issue in 
conjunction with future consideration of rational numbers.
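
As an illustration of the rational-number angle: present-day Python ships `fractions.Fraction`, which did not exist when this was written, and both an integer and a finite float convert to it exactly.  The sketch below uses it as the common type; `cmp_exact` is a hypothetical name and the code assumes x is finite.

```python
from fractions import Fraction

def cmp_exact(n, x):
    """Compare integer n with finite float x by converting both to
    Fraction, which represents each value without rounding."""
    fn, fx = Fraction(n), Fraction(x)
    return (fn > fx) - (fn < fx)
```

This is simple but not cheap, since it builds two rationals per comparison; it is meant to show that a lossless common type makes the definition straightforward.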

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-02-08 23:33

Message:
Logged In: YES 
user_id=31435

I reopened this, but unassigned it since I can't justify 
working on it (the benefit/cost ratio of fixing it is down 
in the noise compared to other things that should be done).

I no longer think we'd need a PEP to change the behavior, 
and agree it would be nice to change it.  Changing it may 
surprise people expecting Python to work like C (C99 says 
that when integral -> floating conversion is in range but 
can't be done exactly, either of the closest representable 
floating numbers may be returned; Python inherits the 
platform C's behavior here for Python int -> Python float 
conversion (C long -> C double); when the conversion is out 
of range, C doesn't define what happens, and Python 
inherits that too before 2.2 (Infinities and NaNs are what 
I've seen most often, varying by platform); in 2.2 it 
raises OverflowError).

I'm not sure it's possible for a<b and b<c and a==c, unless 
the platform C is inconsistent (meaning that (double)i for 
a fixed i returns the next-lowest double on some calls but 
the next-higher on others).  This brute-force searcher 
didn't turn up any examples on my box:

f = 2L**53 - 5 # straddle the representable-as-double limit

nums = [f+i for i in range(50)]
nums.extend(map(float, nums))

for a in nums:
    for b in nums:
        if not a < b:
            continue
        for c in nums:
            if not b < c:
                continue
            if a >= c:
                print `a`, `b`, `c`
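
A rough Python 3 port of the same search, for readers who want to rerun it.  Because Python 3 compares integers and floats exactly, the old coercion has to be emulated; `legacy_lt` and `find_violations` are hypothetical names introduced here, and the emulation assumes that `float()` rounds consistently.

```python
def legacy_lt(p, q):
    """Emulate the old rule: if either operand is a float, coerce both
    to float before comparing (hypothetical helper)."""
    if isinstance(p, float) or isinstance(q, float):
        return float(p) < float(q)
    return p < q

def find_violations(span=50):
    """Return triples (a, b, c) with a<b and b<c but not a<c."""
    f = 2**53 - 5  # straddle the representable-as-double limit
    nums = [f + i for i in range(span)]
    nums.extend([float(n) for n in nums])
    found = []
    for a in nums:
        for b in nums:
            if not legacy_lt(a, b):
                continue
            for c in nums:
                if legacy_lt(b, c) and not legacy_lt(a, c):
                    found.append((a, b, c))
    return found
```

With a consistent, monotonic conversion the search should come back empty, matching the result reported above.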

----------------------------------------------------------------------

Comment By: Andrew Koenig (arkoenig)
Date: 2002-02-07 07:33

Message:
Logged In: YES 
user_id=418174

Here is yet another surprise:

x=[1L<<10000]
y=[0.0]
z=x+y

Now I can execute x.sort() and y.sort() successfully, but 
z.sort() blows up.
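
A rough reproduction in Python 3, where `key=float` stands in for the implicit long-to-float coercion of Python 2.2.  That substitution is an assumption made for illustration; Python 3 itself compares the two values exactly.

```python
z = [1 << 10000, 0.0]

try:
    sorted(z, key=float)       # forces every element through float()
    coercion_failed = False
except OverflowError:
    coercion_failed = True     # 1 << 10000 is too large for a double

exact_order = sorted(z)        # exact int/float comparison, no coercion
```

The point is that the failure comes from the conversion step rather than from the sort algorithm.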

----------------------------------------------------------------------

Comment By: Andrew Koenig (arkoenig)
Date: 2002-02-07 05:28

Message:
Logged In: YES 
user_id=418174

The difficulty is that as defined, < is not an order 
relation, because there exist values a, b, c such that a<b, 
b==c, and a==c.  I believe that there also exist values 
such that a<b, b<c, and a==c.  Under such circumstances, it 
is hard to understand how sort can work properly, which is 
my real concern.  Do you really want to warn people that 
they shouldn't sort lists containing floats and longs?

Moreover, it is not terribly difficult to define the 
comparisons so that == is an equivalence relation and < is 
an order relation.  The idea is that for any floating-point 
system, there is a threshold T such that if x is a float 
value >=T, converting x to long will not lose information, 
and if x is a long value <=T, converting x to float will 
not lose information.  Therefore, instead of always 
converting to long, it suffices to convert in a direction 
chosen by comparing the operands to T (without conversion) 
first.
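
A sketch of the threshold idea in Python 3, assuming IEEE 754 doubles so that T = 2**53 (derived here from `sys.float_info.mant_dig`).  It compares magnitudes against T, which is one reading of the description above, and assumes x is finite; `cmp_threshold` is a hypothetical name.

```python
import sys

# Every float with magnitude >= T is an integer, and every integer with
# magnitude <= T converts to float exactly (assumes a binary float format).
T = 2 ** sys.float_info.mant_dig

def cmp_threshold(n, x):
    """Return -1, 0 or 1 as integer n is <, == or > finite float x."""
    if abs(x) >= T:
        xi = int(x)                 # lossless: x has no fraction part
        return (n > xi) - (n < xi)
    if abs(n) <= T:
        nf = float(n)               # lossless: n fits in the mantissa
        return (nf > x) - (nf < x)
    # Remaining case: |n| > T > |x|, so the sign of n decides.
    return 1 if n > 0 else -1
```

Each branch performs only a conversion that cannot lose information, so equality stays transitive and the ordering stays consistent.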

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-02-06 20:59

Message:
Logged In: YES 
user_id=31435

Oops!  I meant

"""
could lead to a different result than the explicit coercion 
in

somefloat == float(somelong)
"""

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-02-06 20:52

Message:
Logged In: YES 
user_id=31435

Since the coercion to float is documented and intended, 
it's not "a bug" (it's functioning as designed), although 
you may wish to argue for a different design, in which case 
making an incompatible change would first require a PEP and 
community debate.  Information loss in operations involving 
floats comes with the territory, and I don't see a reason 
to single this particular case out as especially 
surprising.  OTOH, I expect it would be especially 
surprising to a majority of users if the implicit coercion 
in

somefloat == somelong

could lead to a different result than the explicit coercion 
in

long(somefloat) == somelong

Note that the "long" type isn't unique here:  the same is 
true of mixing Python ints with Python floats on boxes 
where C longs have more bits of precision than C doubles 
(e.g., Linux for IA64, and Crays).
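
For comparison, present-day Python 3 compares an integer and a float by exact value, so the implicit and explicit forms can indeed disagree.  A small check, assuming IEEE 754 doubles (the variable names follow the message above):

```python
somelong = 2**53 + 1        # not exactly representable as a double
somefloat = float(2**53)

implicit = (somefloat == somelong)          # exact mixed comparison
explicit = (somefloat == float(somelong))   # somelong rounds to 2**53 first
```

Here `implicit` is False and `explicit` is True.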

----------------------------------------------------------------------
