[Matrix-SIG] Proposal for further single-precision clean-up.

Travis Oliphant Oliphant.Travis@mayo.edu
Sun, 16 Jan 2000 10:21:41 -0600 (CST)


> 
> As an astronomer, I would love to increase my use of NumPy -- but have
> proceeded very slowly due to the "single precision" problems.  I often
> need to access _huge_ arrays of single precision floats (generated by
> Fortran or C code), and the upcasting to double is a killer for my
> applications.

It would benefit me if you could post an explanation and/or examples of
what you think the "single precision" problem is.  I have been thinking of
ways to improve the situation according to the difficulties I've
encountered in my own work, but it would be beneficial if others posted
example code.

With the latest changes to NumPy (Release 15.X) one can define single
precision scalars as NumPy arrays of rank-0, and they will not be converted
behind the scenes to double-precision Python scalars, which would otherwise
cause an upcast somewhere along the line.
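
For illustration, a quick sketch of the difference (the exact results
depend on your Numeric release; the second line reflects the new behavior
described above):

    import Numeric

    a = Numeric.array([1.0, 2.0, 3.0], Numeric.Float32)   # typecode 'f'
    s = Numeric.array(2.0, Numeric.Float32)               # rank-0 Float32 "scalar"

    print (a * 2.0).typecode()   # 'd' -- the Python float forces an upcast
    print (a * s).typecode()     # 'f' -- the rank-0 array preserves Float32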

The one remaining problem that I see (let me know if you see others) is
that retrieving an element from a single precision array still returns a
double-precision Python scalar.  This would still cause upcasting.  I
think this could be fixed and is what I proposed to do in my previous
post.
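
For example (again just a sketch of what I mean):

    import Numeric

    a = Numeric.array([1.0, 2.0, 3.0], Numeric.Float32)

    x = a[0]                     # currently a double-precision Python float
    print type(x)                # <type 'float'>
    print (a * x).typecode()     # 'd' -- the retrieved element causes an upcast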

Another idea I've tossed around with a colleague is to introduce the
concept of passive casting.  The idea is to introduce another flag that
allows the user to specify that an array is passive in its casting
requirements.  So, if a passive-casting array operates with an
active-casting array, the active-casting array always determines the type.
If two passive-casting arrays operate on each other, the normal rules
apply and a passive-casting array is returned.  The other piece that would
make this useful is to make all Python scalars automatically
passive-casting (since their precision cannot be specified).
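
In plain Python the proposed rule would look roughly like this (just a
sketch of the logic; the passive flag and these helper names are
hypothetical and do not exist anywhere yet):

    def normal_coercion(t1, t2):
        # stand-in for the existing coercion rules: take the "larger" type
        order = ['f', 'd']                  # Float32, Float64 (simplified)
        return order[max(order.index(t1), order.index(t2))]

    def result_typecode(a_type, a_passive, b_type, b_passive):
        if a_passive and not b_passive:
            return b_type                   # the active array alone sets the type
        if b_passive and not a_passive:
            return a_type
        # both passive, or both active: use the normal rules
        return normal_coercion(a_type, b_type)

    # A Python float would count as a passive 'd', so:
    print result_typecode('d', 1, 'f', 0)   # 'f' -- the Float32 array is not upcast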

> 
> Since I am not terribly familiar with the NumPy code base (and I'm sure
> that many on this list aren't either), I would like to request that
> someone more knowledgeable please post a summary of the pros and cons for
> treating single precision scalars as rank-0 arrays (or as something
> different -- i.e. a new Python intrinsic).

Python allows you to define new types that behave in nearly all respects
just like Python scalars.  Much of the C code in Python is written so that
if a C routine expects, say, a Python integer, it first checks whether the
object is an integer and, if not, whether the object can act like an
integer (i.e. whether int(object) returns a Python integer).  There are a
few places in the code where this is not followed and the code must have a
real Python integer (list indexing is an example --- I don't know why this
couldn't be changed, however).  Because Python does not itself define the
concept of single-precision scalars, we can and should decide on a
Numerical Python definition of them.
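
As a toy illustration of that protocol (not a proposal for the actual
type), a class that defines __int__ and __float__ is accepted almost
anywhere a number is expected, but not as a list index:

    class FakeScalar:
        # toy example only; a real single-precision scalar type for
        # Numerical Python would of course be written in C
        def __init__(self, value):
            self.value = value
        def __int__(self):
            return int(self.value)
        def __float__(self):
            return float(self.value)

    x = FakeScalar(3.5)
    print float(x)            # 3.5 -- works via __float__
    print int(x)              # 3   -- works via __int__
    # [0, 1, 2, 3][x]         # fails: list indexing insists on a real integer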

Right now rank-0 arrays fit very well as a definition for single-precision
scalars and, with a few changes, could be a consistent view throughout the
current code.  There is significant, unnecessary overhead for rank-0
arrays, however, which could be a practical argument for a separate scalar
type that supports the concept of single precision.  Separating the
concepts of scalars and rank-0 arrays would probably also make some folks
happy.  Such a type could later become a unifying concept for all of
Python's scalars (though it would necessarily have more overhead than the
current implementation of integers, which may make it hard for Guido to
buy into).
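
To give a feel for the overhead I mean, a rough timing comparison along
these lines (the exact numbers will of course vary with machine and
release):

    import time
    import Numeric

    s = Numeric.array(2.0, Numeric.Float32)   # rank-0 array "scalar"
    f = 2.0                                   # ordinary Python float

    def tight_loop(x, n=100000):
        t0 = time.clock()
        total = x
        for i in xrange(n):
            total = total + x
        return time.clock() - t0

    print "Python float: ", tight_loop(f)
    print "rank-0 array: ", tight_loop(s)     # noticeably slower per operation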

Defining a new type would mean more code.  It would be very easy code to
write, but time-consuming, and it would also increase the size of
Numerical Python.  I don't mind that, but I'm not sure of others'
opinions.

> 
> I am really interested to learn what the differences in functionality
> and performance would be when using the various different
> implementations on NumPy arrays -- using ufuncs, the Python math module,
> matrix operations, casting, etc.

Currently, all Python scalars are converted to rank-0 arrays before any
operations are done in NumPy, so there are no real performance issues
there.  The Python math module only works with Python scalars, so you
would have to convert a single-precision scalar to a Python scalar (using
int(), float(), etc.) before using one of its functions.  Of course, the
Python math module is faster for a scalar computation than a ufunc is
(there is more overhead in the ufunc).
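
In other words (a sketch):

    import math
    import Numeric

    a = Numeric.array([1.0, 4.0, 9.0], Numeric.Float32)
    s = Numeric.array(4.0, Numeric.Float32)    # rank-0 "scalar"

    print Numeric.sqrt(a)          # ufunc: works elementwise on the whole array
    print math.sqrt(float(s))      # math module: needs an explicit float() first
    print Numeric.sqrt(s)          # ufunc on a rank-0 array: works, but with
                                   #   more call overhead than math.sqrt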

> I am highly encouraged by the recent interest in moving NumPy forward,
> and with the correct changes, I think the user base among Scientists
> will increase dramatically.

I'm glad to hear that.  I would like to see the user base increase, too.

Travis