How safe is a set of floats?

Fri May 4 10:50:21 EDT 2007

Thomas Nelson <thn at mail.utexas.edu> wrote:

> I want to generate all the fractions between 1 and limit (with
> limit>1) in an orderly fashion, without duplicates.
> 
> def all_ratios(limit):
>     s = set()
>     hi = 1.0
>     lo = 1.0
>     while True:
>         if hi/lo not in s:
>             s.add(hi/lo)
>             yield (hi,lo)
>         hi += 1
>         if hi/lo > limit:
>             lo += 1
>             hi = lo
> 
> I use a set to keep from giving duplicates; but is this safe?  In C
> they always tell you not to trust floating point equality comparisons,
> since they may not work as you expect.  My code seems fine for the
> limited amount I've tested, but I'm curious: is there a gaurantee
> about sets of floats?  Or a warning?

sets of floats work exactly like sets of anything else and thus in
particular they DO intrinsically rely on == comparisons, i.e., exact
equality checks (just like dicts whose keys are floats, etc).

In your code, some "fractions" that actually differ from others you're
previously seen will in fact be skipped because they don't differ _by
enough_ -- i.e. they do compare == to within the limited precision of
floating-point computations.  But if you do want to be yielding floats,
and never want to yield the (num, denom) tuples for two items that *as
float* compare ==, there's nothing you can do about that issue.

My main suggestion to you actually would be to compute hi/lo ONCE per
iteration rather than 3 times -- I detest repetition in principle and
here it may be costing you a few nanoseconds' speed:-)

[[If you don't truly care about whether the fractions you yield do
compare as == "as floats", you might e.g. use gmpy.mpq rather than
division to perform your checks]]

Alex