[SciPy-dev] problems with amin/amax when nan present

Fri Aug 16 16:05:27 EDT 2002

On one level, it isn't that hard to do this in C, because we already
have an altered version of the pertinent Numeric file (fastumath) in our
CVS.  This was necessary to allow SciPy to handle NaNs within arrays
without throwing errors because, as you noted, Numeric doesn't support
this.  So, we can add the isnan() check to the maximum/minimum functions
in fastumath without touching Numeric.  But, for now, I think we can
stick with the "simple" versions I wrote, and move them to C later if
they turn out to be a bottleneck in a critical application.  I'll check
them into scipy_base.  I guess we could also add a flag to disable the
check, but I'm not excited about adding such options to standard
routines.
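
To make the trade-off concrete, here is a minimal sketch of the kind of
check being discussed.  It is plain Python over a sequence of floats, not
the Numeric-based versions going into scipy_base.  The names
nan_safe_amax, nan_safe_amin and check_nan are hypothetical, and
returning NaN whenever a NaN is present is an assumption about the
desired behaviour.

```python
import math

NAN = float("nan")

def _has_nan(values):
    # The extra pass over the data; this is the cost paid even when
    # no NaN is present.
    return any(math.isnan(v) for v in values)

def nan_safe_amax(values, check_nan=True):
    # Hypothetical helper: return NaN if any element is NaN
    # (assumed policy), otherwise the ordinary maximum.
    values = list(values)
    if check_nan and _has_nan(values):
        return NAN
    return max(values)

def nan_safe_amin(values, check_nan=True):
    # Same idea for the minimum.
    values = list(values)
    if check_nan and _has_nan(values):
        return NAN
    return min(values)

# Why the check matters: every comparison against NaN is False, so the
# plain reduction gives an answer that depends on where the NaN sits.
print(max([1.0, NAN, 2.0]))   # the NaN is skipped over
print(max([NAN, 1.0, 2.0]))   # the NaN is kept as the running maximum
print(nan_safe_amax([1.0, NAN, 2.0]))
```

The check_nan argument stands in for the optional flag mentioned above;
with check_nan=False the functions fall back to the order-dependent
behaviour of the plain reduction.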

As for numarray, I'm all for moving to it as soon as it is feasible.
Right now, there is still a large amount of optimization required before
it can replace Numeric in SciPy.  Here are some timings comparing
Numeric to the latest numarray (0.3.6):

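>>> import time
>>> import Numeric, numarray
>>> def time_add(a, N):
...     # Time N repetitions of the elementwise sum a + a.
...     X = range(N)
...     t1 = time.time()
...     for i in X:
...         b = a + a
...     t2 = time.time()
...     return t2 - t1
... 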
>>> a = Numeric.arange(10000,typecode=Numeric.Float32)
>>> time_add(a,100)
0.019999980926513672
>>> a = numarray.arange(10000,type=numarray.Float32)
>>> time_add(a,100)
0.13099992275238037
>>> a = Numeric.arange(100,typecode=Numeric.Float32)
>>> time_add(a,10000)
0.060000061988830566
>>> a = numarray.arange(100,type=numarray.Float32)
>>> time_add(a,10000)
10.025000095367432

So for small arrays, you can see the price is high (roughly 170x slower
here), and, even for medium-sized arrays, there is still about a 6.5x
slowdown.  It should be noted that optimization isn't STScI's goal right
now -- there are plenty of other things to work on as far as getting the
rest of the compatibility issues and new features working.  I have high
hopes that numarray will become as fast as, or faster than, Numeric in
the future, but I haven't looked hard at what this entails.  I also think
it will not happen in the near term.  Perry may have more comments on
this.
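
For reference, the slowdown factors and per-call costs can be worked out
directly from the four timings in the session above.  This is just
arithmetic on those numbers; the helper names per_call and slowdown are
made up for illustration.

```python
# Timings copied from the session above:
# (array length, number of additions) -> elapsed seconds.
timings = {
    "Numeric": {
        (10000, 100): 0.019999980926513672,
        (100, 10000): 0.060000061988830566,
    },
    "numarray": {
        (10000, 100): 0.13099992275238037,
        (100, 10000): 10.025000095367432,
    },
}

def per_call(seconds, iterations):
    # Average wall-clock time of a single `a + a`, in seconds.
    return seconds / iterations

def slowdown(case):
    # How many times slower numarray is than Numeric for one case.
    return timings["numarray"][case] / timings["Numeric"][case]

for case in [(10000, 100), (100, 10000)]:
    size, iterations = case
    print("%5d elements: numarray is %.1fx slower" % (size, slowdown(case)))
    for name in ("Numeric", "numarray"):
        microseconds = per_call(timings[name][case], iterations) * 1e6
        print("    %-8s %.0f microseconds per add" % (name, microseconds))
```

Read this way, a single numarray addition costs on the order of a
millisecond for both array sizes, which suggests that a fixed per-call
overhead, rather than the per-element work, dominates these particular
measurements.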

eric

> -----Original Message-----
> From: scipy-dev-admin at scipy.net [mailto:scipy-dev-admin at scipy.net] On
> Behalf Of Pearu Peterson
> Sent: Friday, August 16, 2002 10:50 AM
> To: scipy-dev at scipy.net
> Subject: Re: [SciPy-dev] problems with amin/amax when nan present
> 
> 
> On Fri, 16 Aug 2002, eric wrote:
> 
> > So, I've found an ugly fix listed at the end of this message.  It will
> > make amin/amax slower (probably 2X) even for the case when NaNs aren't
> > present because it has to test whether any NaNs exist.  Correct is more
> > important than fast, but I was wondering if anyone has a better
> > solution.  We could test within the minimum/maximum methods in C using
> > isnan(), I suppose, to save the extra array creation.
> 
> Does it make sense to fix this in Numeric? I am not sure if Numeric is
> supposed to support NaNs, but considering that soon (?) Numeric will be
> unmaintained (see Numarray's design) and (I guess) it would be rather
> difficult to make the transition from Numeric to Numarray quickly,
> we could start taking over the Numeric array stuff by fixing these kinds
> of issues. It sounds radical (which I don't like) and I really hope that
> starting to use Numarray in SciPy would be easier (for me it would mean
> implementing Numarray support for f2py and here I have no idea how
> difficult it will turn out to be).
> 
> On the other hand, amax and amin are probably not heavily used in
> large-scale calculations, so the above may not be worth the trouble.
> Maybe amin/amax should have an extra flag for disabling isnan checking?
> 
> Just some random thoughts ...
> 	Pearu