[SciPy-dev] operations on int8 arrays

Ed Schofield schofield at ftw.at
Wed Oct 19 18:24:21 EDT 2005



On Wed, 19 Oct 2005, Travis Oliphant wrote:

> Jon Peirce wrote:
>
> >>Scipy arrays with dtype=uint8 or int8 seem to be
> >>mathematically-challenged on my machine (AMD64 WinXP running python
> >>2.4.2, scipy core 0.4.1). Simple int (and various others) appear fine.
> >> >>> import scipy
> >> >>> xx = scipy.array([100,100,100], scipy.int8)
> >> >>> print xx.sum()
> >> 44
> >> >>> xx = scipy.array([100,100,100], scipy.int)
> >> >>> print xx.sum()
> >> 300
>
> This is not a bug.   In the first line, you are telling the computer to
> add up 8-bit integers.  The result does not fit in an 8-bit integer ---
> thus you are computing modulo 256.
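
The modulo-256 arithmetic can be traced by hand. The following is a minimal pure-Python sketch, not scipy's implementation; the helper name wrap_int8 is made up for illustration and simply folds a value back into the signed 8-bit range after each addition:

```python
def wrap_int8(value):
    """Fold an integer into the signed 8-bit range [-128, 127]."""
    return (value + 128) % 256 - 128


# Accumulate 100 + 100 + 100 as an 8-bit accumulator would,
# recording the running total after each step.
running = []
total = 0
for v in [100, 100, 100]:
    total = wrap_int8(total + v)
    running.append(total)

print(running)  # [100, -56, 44]
```

The second addition already overflows (200 wraps to -56), and the final total of 44 is 300 - 256.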


I was bitten by this back in April:
http://www.scipy.org/mailinglists/mailman?fn=scipy-dev/2005-April/002937.html

I wasted several hours then trying to hunt down bugs in my code before I
finally realized that my sum() call was responsible.  I strongly believe
that sum() should upcast by default.  My reasons are:

1. Python would do the same: it 'just works', upcasting where necessary
from int to long integer and, in the future, making division of two int
arguments return a float.  We also want to avoid differences between
Python's sum() and scipy's sum():

>>> a = scipy.array([100,100, 100], scipy.int8)
>>> sum(a)
300
>>> scipy.sum(a)
44

2. The result of sum() or mean() without any modulo arithmetic would be a
Python int or float, and it seems reasonable to expect the result to be
accurate to the width of the output type.

3. The advantage in space efficiency of using a smaller type for
accumulated operations is minimal, since the result is a single value
(unlike, perhaps, an operation whose output is an array).
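
To make the difference between the two defaults concrete, here is a rough sketch using only the standard library. It does not call scipy; add_int8 and add_wide are hypothetical stand-ins for a reduction that keeps the 8-bit input type and one that upcasts to a wider accumulator:

```python
import ctypes
from functools import reduce


def add_int8(a, b):
    # ctypes.c_int8 does no overflow checking: it keeps only the low
    # 8 bits, which mimics accumulating in the input type.
    return ctypes.c_int8(a + b).value


def add_wide(a, b):
    # A plain Python addition stands in for an upcast accumulator.
    return a + b


data = [100, 100, 100]
narrow = reduce(add_int8, data)  # accumulate in 8 bits
wide = reduce(add_wide, data)    # accumulate in a wider type

print(narrow, wide)  # 44 300
```

The wide accumulator costs one extra machine word for the whole reduction, whatever the length of the input.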

> It would be possible to make the default reduce type for integers 32-bit
> on 32-bit platforms and 64-bit on 64-bit platforms, i.e. the long integer type.

As far as I understand, a Python int is always a C long, but a C long
isn't always the platform word length (e.g. it is sometimes 32 bits on
64-bit machines).  So perhaps it would be better to make the default
reduce type for integers a C long?
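
The width of a C long on a given machine can be checked from Python with the standard ctypes module. This is only a quick sketch; the numbers printed depend on the platform and compiler, so none are shown here:

```python
import ctypes

# Width of a C long versus the width of a pointer, in bits.
long_bits = 8 * ctypes.sizeof(ctypes.c_long)
pointer_bits = 8 * ctypes.sizeof(ctypes.c_void_p)

print("C long: %d bits, pointer: %d bits" % (long_bits, pointer_bits))
```

Where the two values differ, a "platform word" default and a "C long" default would pick different reduce types.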


> Or, this could simply be the default when calling the .sum method (which
> is add.reduce under the covers).  The reduce method could stay with the
> default of the integer type.

I think reduce should upcast too.



-- Ed



