Future floating point directions? [was Re: floating point in 2.0]
Edward Jason Riedy
ejr at cs.berkeley.edu
Sun Jun 10 21:48:35 EDT 2001
And Tim Peters writes:
-
- Python has no 754 story -- it doesn't try.
Which is really astonishing for a language aiming at ease of use
and teaching. Yes, I know, it's someone else's fault. (Not that
I've been jumping up and down with code. sigh... C++ sucks my
will to code, but I don't know any more flexible language that
compiles on both a T3E and an SP.)
- (and sorry to say I'm not sure it ever will be -- there too it's an
- optional thing).
I can think of three companies with test vectors being developed,
so I assume there's something to test. Of course, all three also
sell hardware with these features. gcc's coming along, but with
much more finely stretched resources.
- It certainly wasn't. Fusing the mul and add HW wasn't even suggested by
- 754, it was something they did to reduce chip real estate, [...]
Nope. They happened to have a chunk of space left, someone
suggested fma for correctly rounding an iterative division, and
an engineer did it over a weekend, fitting it into the add, mul,
and extra space. Can't remember his name, but I can dig through
my notebook if necessary. Story from Dr. Kahan, who was there.
Go fig.
- and because they had clear ways in mind to speed math libraries [...]
Actually, it was for getting a division without a divider. But
your point's mostly valid. The primary push was on getting some
extra speed (software-ish, thus pipelined, divide), but the group
also wanted to compete on reliability features. Thus the extra
breakdown of the invalid flag.
Most of the features we want to add will address reliability /
debuggability in some way. Things like trapping methods that
don't require precise interrupts, etc.
- Sun is an exception, and I'd guess more due to David Hough's
- influence and persistence than to anything else --
Yes, but he's not the only one at Sun. They're mostly ex-students
of Dr. Kahan's, though.
And I could have drawn an example from HP, and that wouldn't be
because of a Kahan student. Surprising, but true. ;)
- Virtually all HW FPUs support them, and because they're required.
And because rounding modes are nigh impossible to implement in
software. I do wish software had access to the three extra bits
necessary to round correctly. Compiling without double-rounding
on x86 is pretty difficult without them.
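To make the double-rounding hazard concrete, here's a small sketch in
modern Python (simulating single precision with `struct`; the value is
built so it lands exactly on a single-precision tie after a first
rounding to double):

```python
import struct
from fractions import Fraction

def round_to_single(x):
    # Round a Python float (an IEEE double) to the nearest IEEE single.
    return struct.unpack('f', struct.pack('f', x))[0]

# An exact value just above the single-precision tie at 1 + 2**-24.
exact = Fraction(1) + Fraction(1, 2**24) + Fraction(1, 2**54)

# Rounding to double first drops the 2**-54 bit; the second rounding
# (double -> single) then sees an exact tie and rounds to even: 1.0.
double_rounded = round_to_single(float(exact))   # 1.0 -- wrong

# Correctly rounding the exact value straight to single rounds up.
correct_single = 1.0 + 2**-23
```

That's exactly what an x86 FPU does when it keeps intermediates in a
wider format than the destination.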
- Without that C99 wouldn't have even considered it (part of X3J11's
- mandate is not to require things that can't be done w/ reasonable
- efficiency, so w/o near-universal HW support already in place, they
- would have punted on this too).
Um, fma is required in C99. It certainly lacks near-universal
hardware support... And portability pressures have a way of
making optional features required. Look at how many vendors
have Linux syscall translation layers.
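(For reference, C99's required fma(a, b, c) computes a*b + c with a
single rounding.  A Python sketch emulating that defining property
exactly via rationals -- not fast, just illustrative:)

```python
from fractions import Fraction

def fma(a, b, c):
    # a*b + c computed exactly, then rounded once -- the defining
    # property of fused multiply-add.
    return float(Fraction(a) * Fraction(b) + Fraction(c))

a = 1.0 + 2**-30
# The naive a*a - 1.0 rounds twice and loses the low bits of the
# square; the fused version keeps them.
naive = a * a - 1.0          # 2**-29 (the 2**-60 term rounded away)
fused = fma(a, a, -1.0)      # 2**-29 + 2**-60
```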
- Tossing "recommended" out the window is the best tool a std has.
Considering the groups being targeted have completely ignored
REQUIRED pieces, I don't know if it matters.
- You said you're introducing a quad format, and I assume that's
- "the usual" 15-bit exponent + 113-bit mantissa one, so stop at
- required correct rounding to and from no more than 36 decimal
- digits, and everyone will be happy (this can be done with code
- much simpler than Gay's, and also without bigint arithmetic).
Well, if the only defined destinations are single, double, and
quad, then you only have to support conversions to those
destinations. The standard only defines things in the standard.
- > Binary to decimal is constrained so that binary -> decimal -> binary
- > produces the same number if both binary numbers are in the same
- > precision.
-
- Users want that, but IIRC 754 already requires it.
Not quite. It's required for a smaller range than is now
possible.
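(The digit counts that guarantee the round trip are 9 for single, 17
for double, and 36 for quad.  Since Python's float is a double, the
17-digit case is easy to check from the interpreter:)

```python
x = 0.1 + 0.2                      # any double will do
assert float('%.17g' % x) == x     # 17 digits always round-trip a double
assert float('%.16g' % x) != x     # 16 digits can fall short
```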
- WRT single default, it doesn't have enough precision to avoid gross
- surprises -- newbies aren't forecasting the weather with one-digit
- inputs, they're calculating their retirement benefits based on 20%
- compound growth for 50 years <wink>.
Ok. Calculate in double precision with single precision inputs
and output. You'll get _much_ more reliable answers. But
unfortunately, Python doesn't make that easy. A calculator with
that feature (the HP 12C) does use extra precision in the
calculation.
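Tim's own example shows the difference.  A sketch, faking single
precision with `struct` (the numbers are illustrative, not financial
advice):

```python
import struct

def to_single(x):
    # Round a double to the nearest IEEE single, simulating a
    # single-precision-only calculator.
    return struct.unpack('f', struct.pack('f', x))[0]

rate, years = 1.2, 50            # 20% compound growth for 50 years
all_single = to_single(1000.0)   # rounded to single at every step
wide_eval = 1000.0               # evaluated in double throughout
for _ in range(years):
    all_single = to_single(all_single * rate)
    wide_eval *= rate
final = to_single(wide_eval)     # round to single only at the end
```

Rounding once at the end costs at most half a single-precision ulp;
rounding every step accumulates fifty of them.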
- Single is a storage-efficiency gimmick for big problems with
- low-precision data.
Nonsense. Single precision is for limiting the effects of
rounding. Single precision without double precision is
incredibly silly (although some DSPs seem to survive). Single-
precision defaults require a wider evaluation type.
- "Number-like types" include things like Dates and Times in Python;
- there is no widest numeric type, and the set of numeric types isn't
- even known at compile-time (users can add new ones on the fly).
At the very worst, you can find the widest common type of all
floating-point numbers in a given statement. Consider a
statement containing arithmetic-looking operations. The
interpreter can evaluate all the methods appearing in the
statement first. Then it can examine the objects involved
and find anything that's a floating-point type. Dates and
times aren't floating-point types. All the floating-point
types can be coerced up once. This does require that the
floating-point types be comparable, but that's reasonable.
Yes, this does change the semantics slightly. Anyone relying
on both the order and results of subexpression evaluation to
pass information silently between objects should be shot, so I
don't think it's a big loss.
It also changes the implementation a great deal, which is why
there are no patches attached. ;)
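A toy sketch of the statement-level coercion described above.  The
Quad type and the WIDTHS registry are hypothetical -- nothing like
this exists in the interpreter -- but they show the shape of the idea:

```python
# Registry of known floating-point types and their precisions in bits.
WIDTHS = {float: 53}

class Quad(float):
    """Stand-in for a 113-bit quad type (storage here is still a double)."""
WIDTHS[Quad] = 113

def coerce_widest(operands):
    # Find the widest floating type present, then promote every
    # floating operand once.  Non-float types (dates, times) pass
    # through untouched.
    floats = [type(x) for x in operands if type(x) in WIDTHS]
    if not floats:
        return list(operands)
    widest = max(floats, key=WIDTHS.get)
    return [widest(x) if type(x) in WIDTHS else x for x in operands]

ops = coerce_widest([1.0, Quad(2.0), "2001-06-10"])
```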
I'm trying to find a sensible extension outside of statement-
level, but it might be hopeless. It certainly can't be perfect,
as there is no way to assume that the types involved are at all
related to the types of the inputs, but there may be a
reasonable compromise for 90% of the code. A run-time type
inferencer? I think that could be useful well beyond
floating-point expressions.
(IMHO, static and dynamic typing is simply a matter of time
scale. Everything is statically typed when it is actually
executed (run through the processor) and dynamically typed
during development.)
- Even restricted to floating types, at least two Python GMP
- wrappers exist now, at least one of which exposes GMP's
- unbounded (but dynamically specified) precision binary fp type.
This should _not_ be mixable with typical floating-point numbers
without the programmer saying it's ok. The semantics are entirely
different. If you want them mixed, you should be promoting
everything they touch to the arbitrary length types.
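(Python's decimal module, a later addition, takes exactly this stance:
mixing a Decimal with a binary float is a TypeError until the
programmer converts explicitly.)

```python
from decimal import Decimal

refused = False
try:
    Decimal("1.1") + 0.1           # silent mixing is refused outright
except TypeError:
    refused = True

ok = Decimal("1.1") + Decimal("0.1")   # explicit: the programmer opted in
```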
Think of the user whose problem is actually in the decimal->binary
conversion. Or a case like z = a + b + c, where a is a gmp type.
Is it [(a + b) :: gmp-type + c] :: gmp-type, or [a + (b + c) ::
Python float] :: gmp-type? Yes, _I_ know it's left-to-right, but
must everyone be a language lawyer? What if a user of a routine
had input c as a gmp-type, intending for the internals to use
arbitrary precision? The routine would have silently used fixed
precision intermediates. Makes using these things too painful.
Consider too interval arithmetic. For intervals to be really
useful, even the decimal->binary conversions touching an interval
need to return intervals. Otherwise, you've violated the
properties guaranteeing that the answer lies within the result
interval. Requiring explicit casts to intervals doesn't mesh
well with a dynamically typed language like Python. That makes
the code much less reusable.
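A toy sketch of the point (the class is hypothetical, and directed
rounding is faked by widening one ulp with math.nextafter, which needs
Python 3.9+; real implementations set the FPU rounding mode instead):

```python
import math

def next_down(x): return math.nextafter(x, -math.inf)
def next_up(x):   return math.nextafter(x, math.inf)

class Interval:
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
    @classmethod
    def from_decimal(cls, text):
        x = float(text)                       # one rounded conversion...
        return cls(next_down(x), next_up(x))  # ...widened to contain the true value
    def __add__(self, other):
        # Round the lower bound down and the upper bound up so the
        # containment property survives the operation.
        return Interval(next_down(self.lo + other.lo),
                        next_up(self.hi + other.hi))

tenth = Interval.from_decimal("0.1")   # the real 1/10 lies inside
```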
Actually, with a widest-feasible evaluation strategy, you _can_
mix these pretty well. You'll get the most expected results.
I'm still not comfortable with mixing random-length floats with
fixed-length floats, especially when division, square root, and
elementary functions are involved, but intervals and further
fixed-length floats work beautifully.
On the flip side, complex really shouldn't be silently mixed
with real at all. Imagine a beginner trying Cholesky factorization
in Python. It'll seem to run just fine. Any solve with the
factors will give only real numbers from real numbers. But some
results are completely, SILENTLY wrong. Why? The matrix wasn't
positive definite. It should have choked, but entries silently
went imaginary. Solves will still produce real numbers.
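The failure is easy to reproduce in miniature (modern Python shown;
the ** operator is the silent promotion in question):

```python
import math

pivot = -4.0   # a pivot a positive definite matrix would never produce

# math.sqrt chokes, which is what you want:
choked = False
try:
    math.sqrt(pivot)
except ValueError:
    choked = True

# ...but the ** operator silently goes complex:
r = pivot ** 0.5
```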
Now imagine debugging a program that has a symmetric solve
somewhere. Would you spend much time looking at one of the best-
understood portions? Or would you end up blowing a huge amount
of time checking input, data structure conversions, etc.? Most
programmers would assume it's a ``problem with floating-point
arithmetic.''
There are similar problems with many textbook formulas and
complex arithmetic. Perhaps people should know better, but how
often do they? Wouldn't it be better for Python to help them
learn? It somewhat does by separating math and cmath. This one
isn't something that can be handled with a `widest' evaluation
scheme; silently mixing these types in any way can lead to
horrible bugs.
Yeah, this goes against the dynamic typing ideal, but it does
bite students in Matlab... A block-level parameter that allows
or disallows mixing with complex may be the most useful, but I'm
still pondering.
So some attempt at a widest-feasible evaluation scheme would
greatly help programmers get the results they expect. It also
gives tools to avoid silent, incredibly hard to find bugs. Both
are good things for Python, and worth some consideration.
Jason