[SciPy-dev] Logical operations, Numeric.sum() and overflow

Tue Apr 19 13:03:00 EDT 2005

On Tue, 2005-04-19 at 18:48 +0200, Ed Schofield wrote:
> Hi all,
> 
> Importing 'scipy' changes the output of the following code:
> 
> >>> import random
> >>> import Numeric, RandomArray
> >>> (m,n) = (1000,3)
> >>> x = RandomArray.random((m,n))
> >>> y = x < 0.5
> >>> assert sum(y) == Numeric.sum(y)
> 
> from nothing to an AssertionError.
> 
> This is random behaviour: the error occurs about 90% of the time with
> this value of m on my PC (NumPy 23.1, SciPy 0.3.2, Python 2.4.1, Linux
> 2.6.11), but reducing m to 500 makes it occur only about 20% of the time.
> 
> There appear to be two causes:
> (1) importing scipy changes the behaviour of "y = x < 0.5" to return an
> array of typecode 'b' rather than 'l'.
> (2) Numeric.sum() is prone to overflow errors, returning an array of
> type 'b' rather than increasing precision:
> 
> >>> a = Numeric.array([[253,254,255],[1,1,1]],'b')
> >>> Numeric.sum(a)
> array([254, 255,   0],'b')
> 
> Here's my two cents.  On point (1), unit tests are needed to ensure a
> simple 'import scipy' can't change the behaviour of unrelated NumPy
> code.  (How is this even possible?)
> 
> Point (2) seems to indicate a design flaw with Numeric.  How do Octave
> and Matlab deal with this?  Whatever they do, it "just works", whereas
> Numeric feels "broken" in this respect; this overflow propagates through
> other operations (Numeric.average() in my case), and finding such bugs
> can take hours.  Any suggestions / ideas?
> 
> 
> -- Ed

Numarray also uses byte-sized boolean types and has given me similar
problems. I haven't felt the need to complain, but it is annoying. I
suppose the byte type saves a bit of space, but frankly I think it would
be better to stick to plain old integers, as C does.
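
For anyone who wants to see the wraparound without Numeric installed, here
is a minimal sketch in plain Python. The helper name column_sums_uint8 is
made up for this illustration and is not part of Numeric or Numarray. It
only assumes that a sum over typecode 'b' accumulates in 8 unsigned bits,
which is what Ed's output above suggests.

```python
def column_sums_uint8(rows):
    """Sum each column, wrapping at 256 like an 8-bit unsigned accumulator.

    Hypothetical helper for illustration only; it mimics the behaviour
    reported for Numeric.sum() on arrays of typecode 'b'.
    """
    totals = [0] * len(rows[0])
    for row in rows:
        for j, value in enumerate(row):
            # Keep only the low 8 bits after every addition.
            totals[j] = (totals[j] + value) % 256
    return totals


a = [[253, 254, 255], [1, 1, 1]]

wrapped = column_sums_uint8(a)             # 8-bit accumulation
exact = [sum(col) for col in zip(*a)]      # ordinary Python integers

print(wrapped)  # [254, 255, 0]
print(exact)    # [254, 255, 256]

# A column of 300 true values (stored as 1) wraps in the same way.
print(column_sums_uint8([[1]] * 300))  # [44]
```

The second result is what summing with plain integers gives, which is the
behaviour I would prefer for comparisons. With a byte result, any column
holding more than 255 true values silently wraps.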

chuck