[Numpy-discussion] PEP 209: Multi-dimensional Arrays

Wed Feb 14 18:06:23 EST 2001

Paul Barrett writes:
> Rob W. W. Hooft writes:
>  > Being a scientist, I have learned that when you multiply a very accurate
>  > number with a very approximate number, your result is going to be very
>  > approximate, not very accurate! It would thus be more logical to have
>  > Float32*Float64 return a Float32!
> 
> If numeric precision was all that mattered, then you would be correct.
> But numeric range is also important.  I would hate to take the chance
> of overflowing the above multiplication because I stored the result as
> a Float32, instead of a Float64, even though the Float64 is overkill
> in terms of precision.  FORTRAN has made an attempt to address this
> issue in FORTRAN 9X by allowing the user to indicate the range and
> precision of the calculation.
> 

A number in a floating point representation is not necessarily
represented inexactly.  The discussion between Barrett and Hooft
confuses the distinct concepts of precision and accuracy.
Well worth reading is Kahan's scathing criticism of Java's
floating-point model, at least some of which applies directly to
Python's model and to the proposals in PEPs 209 and 228.
http://www.cs.berkeley.edu/~wkahan/JAVAhurt.pdf
See p. 18 for "definitions" of precision and accuracy.
There is a lot more material in the literature and on Kahan's web-site.
The following is an excellent discussion of floating point
arithmetic and the IEEE standards:
http://cch.loria.fr/documentation/IEEE754/ACM/goldberg.pdf
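
A minimal sketch of the points above, assuming Python 3 and only the
standard library.  The helper to_float32 is a hypothetical name introduced
here; it rounds a Python float (an IEEE double) to single precision via
the struct module.  It illustrates that some values are stored exactly,
that precision differs between the two formats, and that range is a
separate question from precision.

```python
import struct
from fractions import Fraction


def to_float32(x):
    """Round a Python float (IEEE double) to the nearest single-precision value.

    Hypothetical helper for illustration. The explicit "<f" format uses the
    standard 4-byte IEEE layout and raises OverflowError when a finite value
    is too large for single precision.
    """
    return struct.unpack("<f", struct.pack("<f", x))[0]


# Exactness: 0.5 is a power of two and is stored exactly; 0.1 is not.
half_is_exact = Fraction(0.5) == Fraction(1, 2)
tenth_is_exact = Fraction(0.1) == Fraction(1, 10)

# Precision: single precision keeps fewer significant bits, so 0.1 rounds
# to a different value than it does in double precision.
tenth_differs_in_single = to_float32(0.1) != 0.1

# Range: 1e20 * 1e20 is finite as a double but too large for single precision.
product = 1e20 * 1e20
try:
    to_float32(product)
    overflows_single = False
except OverflowError:
    overflows_single = True

print(half_is_exact, tenth_is_exact, tenth_differs_in_single, overflows_single)
```

The last case is Barrett's concern in the quoted text: both operands fit
comfortably in single precision, but their product does not.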

With regard to the treatment of errors:
Correct and detailed handling of floating-point exceptions need
not impact speed, provided that there is a mechanism to
(en/dis)able each exception.  Users not interested in exceptions
can simply mask them.  I recall relevant prior discussion including 
constructive comments from Tim Peters.  Many modern and efficient
numerical algorithms, and also effective debugging of numerical 
programs that use large datasets, *require* accurate
and prompt identification of exceptions.  Accurate means that
the arrays, their indices, the operation, the traceback and the type of
exception must be reported.  Delayed reporting of errors is not satisfactory
since operations performed in the interim may destroy valuable data,
or take a very long time (esp. if many exceptions are being generated).
It is probably unreasonable to ask for more than the capabilities 
provided by some subset of the still platform-dependent optimizing
compilers used to implement Python/Numpy, but I don't see why we should 
have much less. 
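
The behaviour asked for above can be sketched in pure Python.  This is
only an illustration under stated assumptions, not a proposed design for
PEP 209: the names ArrayOverflowError, checked_multiply and mask_overflow
are hypothetical, plain lists stand in for arrays, and only overflow is
handled.  With the exception enabled, the first overflow is reported at
once together with the operation, the index and the operands; with it
masked, the IEEE result (inf) is stored and the loop continues.

```python
import math


class ArrayOverflowError(ArithmeticError):
    """Hypothetical exception carrying the operation, index and operands."""

    def __init__(self, operation, index, operands):
        super().__init__(
            f"overflow in {operation} at index {index}: operands {operands}"
        )
        self.operation = operation
        self.index = index
        self.operands = operands


def checked_multiply(xs, ys, mask_overflow=False):
    """Elementwise product of two equal-length sequences of floats.

    mask_overflow=False: raise at the first overflow (prompt reporting).
    mask_overflow=True:  keep the inf result and carry on (exception masked).
    """
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    out = []
    for i, (x, y) in enumerate(zip(xs, ys)):
        z = x * y
        # Overflow: an infinite result produced from two finite operands.
        overflowed = math.isinf(z) and math.isfinite(x) and math.isfinite(y)
        if overflowed and not mask_overflow:
            raise ArrayOverflowError("multiply", i, (x, y))
        out.append(z)
    return out


# Masked: the computation runs to completion and the inf is kept.
masked = checked_multiply([1.0, 1e200], [2.0, 1e200], mask_overflow=True)

# Enabled: the error surfaces immediately, before later elements are touched.
try:
    checked_multiply([1.0, 1e200, 3.0], [2.0, 1e200, 4.0])
except ArrayOverflowError as exc:
    report = (exc.operation, exc.index, exc.operands)
```

A real implementation would rely on the hardware status flags rather
than a per-element test, which is why masking need not cost speed; the
per-element check here is only to keep the sketch self-contained.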

I would encourage the developers of PEPs 209 and 228 to submit their
designs for review by a panel of professional numerical analysts 
(not just numerically literate programmers or scientists).  
While full IEEE 754 within Python or NumPy may still be just
a pipe-dream (for some at least), we can at least take a step closer.

Robert

Robert Harrison
Pacific Northwest National Laboratory
Richland, Washington 99352
(509) 375-2037
robert.harrison at pnl.gov