Comparing float and decimal

Mark Dickinson dickinsm at gmail.com
Wed Oct 1 05:21:50 EDT 2008


On Sep 30, 8:07 pm, Terry Reedy <tjre... at udel.edu> wrote:
> Documenting the problem properly would mean changing the set
> documentation to change at least the definitions of union (|), issubset
> (<=), issuperset (>=), and symmetric_difference (^) from their current
> math-set-based definitions to implementation-based definitions that
> describe what they actually do instead of what they intend to do.  I do
> not like this option.

I was thinking more of a single-line warning in the set documentation
to the effect that funny things happen in the absence of transitivity
of equality, perhaps pointing the finger at Decimal as the most
obvious troublemaker;  the Decimal documentation could elaborate on
this.
That is, rather than documenting exactly what the set operations do,
document what they're supposed to do (just as now) and declare that
behaviour is undefined for sets of elements for which transitivity
fails.
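
For anyone skimming the thread, here's a minimal sketch of the sort of
breakage I mean; the commented results are what I'd expect under the
current 2.6 behaviour, where a Decimal never compares equal to a float:

    from decimal import Decimal

    # Equality isn't transitive across int, float and Decimal in 2.6:
    print(1 == 1.0)             # True
    print(Decimal(1) == 1)      # True
    print(Decimal(1) == 1.0)    # False -- the chain breaks here

    # The breakage leaks into set comparisons, which assume transitivity:
    print(set([1.0]) == set([1]))           # True
    print(set([1]) == set([Decimal(1)]))    # True
    print(set([1.0]) == set([Decimal(1)]))  # False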

> (1A) All that is needed to fix equality transitivity corruption and the
> consequent set/dictview problems is to correctly compare integral
> values.  For this, Decimal hash seems fine already.  For the int i I
> tried, hash(i) == hash(float(i)) == hash(Decimal(i)) ==
> hash(Fraction(i)) == i.

Good point.  Though I'd be a bit uncomfortable with having
Decimal(1) == 1.0 return True, but Decimal('0.5') == 0.5 return False.
Not sure what the source of my discomfort is;  partly I think it's
that I want to be able to explain the comparison rules at the
level of types;  having some floats behave one way and some behave
another feels odd.  And explaining to confused users exactly
why Decimal behaves this way could be fun.
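
To spell out where the discomfort comes from, option 1a amounts to a
comparison rule something like the following (just a toy sketch, not
the real decimal code; decimal_eq_float is a made-up name):

    from decimal import Decimal

    def decimal_eq_float(d, f):
        # Option 1a: cross-type equality only for integral values;
        # any non-integral Decimal/float pair is unconditionally unequal.
        if d == d.to_integral_value() and f.is_integer():
            return int(d) == int(f)
        return False

    print(decimal_eq_float(Decimal(1), 1.0))      # True
    print(decimal_eq_float(Decimal('0.5'), 0.5))  # False, despite matching values

That special-casing of integral values is exactly the sort of thing
I'd find awkward to explain.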

I think I'd prefer option 1 to option 1a.

> (3) Further isolate decimals by making decimals also unequal to all
> ints.  Like (1A), this would easily fix transitivity breakage, but I
> would consider the result less desirable.

I'd oppose this.  I think having decimals play nicely with integers
is important, both practically and theoretically.  There's probably
also existing code out there that depends on comparisons between
integers and Decimals working as expected.
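
For example, perfectly ordinary code along these lines (just an
illustrative snippet, not from any particular codebase) already relies
on mixed int/Decimal comparisons behaving sensibly:

    from decimal import Decimal

    balance = Decimal('123.45')
    limit = 100  # a plain int, e.g. read from a config file

    # Mixed int/Decimal comparison and sorting both work today:
    print(balance > limit)                 # True
    print(sorted([3, Decimal('2.5'), 1]))  # [1, Decimal('2.5'), 3]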

So I guess my ranking is 0 > 1 > 1a > 3, though I could live
with any of 0, 1, or 1a.

It's really the decimal module that's breaking the rules here;
I feel it's the decimal module's responsibility to either
fix or document the resulting problems.

It would also be nice if it were made more obvious somewhere
in the docs that transitivity of equality is important
for correct set and dict behaviour.

Mark


