[Numpy-discussion] #2522 numpy.diff fails on unsigned integers

Tue Nov 4 13:44:36 EST 2014

On Tue, Nov 4, 2014 at 11:19 AM, Sebastian <sebix at sebix.at> wrote:

> On 2014-11-04 15:06, Todd wrote:
> > On Tue, Nov 4, 2014 at 2:50 PM, Sebastian Wagner <sebix at sebix.at> wrote:
> >
> >     Hello,
> >
> >     I want to bring up Issue #2522 'numpy.diff fails on unsigned integers
> >     (Trac #1929)' [1], as it was responsible for an error in one of our
> >     programs. Short explanation of the bug: np.diff performs a subtraction
> >     on the input array. If the array is of type uint and the data contains
> >     falling values, the result is an arithmetic underflow (it wraps around).
> >
> >     >>> np.diff(np.array([0,1,0], dtype=np.uint8))
> >     array([  1, 255], dtype=uint8)
> >
> >     @charris proposed either
> >     - a note to the doc string and maybe an example to clarify things
> >     - or raise a warning
> >     but with a discussion on the list.
> >
> >     I would like to start it now, as it is an error which is not easily
> >     detectable (no errors or warnings are raised). In our case, the type
> >     of a data sequence containing only zeros and ones was changed from
> >     f8, which every other sequence also had, to u4. As the program looked
> >     for values == 1 and == -1, it broke silently.
> >     In my opinion, a note in the docs is not enough and does not help
> >     if the type is changed or set after the program has been written.
> >     I'd go for automatic upcasting of uints by default and an option to
> >     turn it off, if this behavior is explicitly wanted. This wouldn't be
> >     correct from the point of view of a programmer, but most of the users
> >     have a scientific background and expect it 'to work', rather than
> >     something that is theoretically correct but not convenient. (I count
> >     myself among the first group.)
> >
> >
> >
> > When you say "automatic upcasting", that would be, for example uint8
> > to int16?  What about for uint64?  There is no int128.
> The upcast should go to the next bigger signed type, otherwise it would
> again result in wrong values. For uint64 we can't do that, so it has to stay.
> > Also, when you say "by default", is this only when an overflow is
> > detected, or always?
> I don't know how I could detect an overflow in the diff-function. In
> subtraction it should be possible, but that's very deep in the
> numpy-internals.
> > How would the option to turn it off be implemented?  An argument to
> > np.diff or some sort of global option?
> I thought of a parameter upcast_int=True for the function.
>

One could check for a non-decreasing sequence in the unsigned case. Note that
differences of signed integers can also overflow. One way to check in
general is to determine the expected sign using comparisons.
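
As a rough sketch of both checks, combined with the upcast_int parameter
suggested above, something like the following might work. The name
diff_checked is made up for illustration and is not part of NumPy; the
sketch assumes 1-D input and raises an error where a warning might be
preferred in practice.

```python
import numpy as np


def diff_checked(a, upcast_int=True):
    """Hypothetical wrapper around np.diff; not an existing NumPy function."""
    a = np.asarray(a)

    if a.dtype.kind == "u":
        if upcast_int and a.dtype.itemsize < 8:
            # uint8 -> int16, uint16 -> int32, uint32 -> int64
            signed = np.dtype("i%d" % (2 * a.dtype.itemsize))
            return np.diff(a.astype(signed))
        # uint64, or upcasting disabled: only a non-decreasing
        # sequence can be differenced without wrapping around.
        if np.any(a[1:] < a[:-1]):
            raise OverflowError("unsigned input decreases; diff would wrap")
        return np.diff(a)

    d = np.diff(a)
    if a.dtype.kind == "i":
        # Expected sign of each difference, determined by comparisons only.
        expected = (a[1:] > a[:-1]).astype(np.int8) \
            - (a[1:] < a[:-1]).astype(np.int8)
        if np.any(np.sign(d) != expected):
            raise OverflowError("signed difference overflowed")
    return d
```

With this, diff_checked(np.array([0, 1, 0], dtype=np.uint8)) gives
array([ 1, -1], dtype=int16), while a decreasing uint64 sequence or an int8
input such as [-128, 127] raises OverflowError instead of silently wrapping.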

Chuck