[Numpy-discussion] UFunc out argument not forcing high precision loop?

Fri Sep 27 18:02:01 EDT 2019

On Fri, 2019-09-27 at 11:50 -0700, Sebastian Berg wrote:
> Hi all,
> 
> Looking at the ufunc dispatching rules with an `out` argument, I was
> a
> bit surprised to realize this little gem is how things work:
> 
> ```
> arr = np.arange(10, dtype=np.uint16) + 2**15
> print(arr)
> # array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18], dtype=uint16)
> 

Whoops, copied that print wrong of course.

Just to be clear, I personally will consider this an accuracy/precision
bug and assume that we can just switch the behaviour failry
unceremoniously at some point (and if someone feels that should be a
major release, I do not mind).
It seems like one of those things that will definitely fix some bugs
but could break the odd system/assumption somewhere. Similar to fixing
the memory overlap issues.

- Sebastian

> out = np.zeros(10)
> 
> np.add(arr, arr, out=out)
> print(repr(out))
> # array([ 0.,  2.,  4.,  6.,  8., 10., 12., 14., 16., 18.])
> ```
> 
> This is strictly speaking correct/consistent. What the ufunc tries to
> ensure is that whatever the loop produces fits into `out`.
> However, I still find it unexpected that it does not pick the full
> precision loop.
> 
> There is currently only one way to achieve that, and this by using
> `dtype=out.dtype` (or similar incarnations) which specify the exact
> dtype [0].
> 
> Of course this is also because I would like to simplify things for a
> new dispatching system, but I would like to propose to disable the
> above behaviour. This would mean:
> 
> ```
> # make the call:
> np.add(arr, arr, out=out)
> 
> # Equivalent to the current [1]:
> np.add(arr, arr, out=out, dtype=(None, None, out.dtype))
> 
> # Getting the old behaviour requires (assuming inputs have same
> dtype):
> np.add(arr, arr, out=out, dtypes=arr.dtype)
> ```
> 
> and thus force the high precision loop. In very rare cases, this
> could
> lead to no loop being found.
> 
> The main incompatibility is if someone actually makes use of the
> above
> (integer over/underflow) behaviour, but wants to store it in a higher
> precision array.
> 
> I personally currently think we should change it, but am curious if
> we
> think that we may be able to get away with an accelerate process and
> not a year long FutureWarning.
> 
> Cheers,
> 
> Sebastian
> 
> 
> [0] You can also use `casting="no"` but in all relevant cases that
> should find no loop, since the we typically only have homogeneous
> loop
> definitions, and
> 
> [1] Which is normally the same as the shorter spelling
> `dtype=out.dtype` of course.
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: This is a digitally signed message part
URL: <http://mail.python.org/pipermail/numpy-discussion/attachments/20190927/a5997a1f/attachment.sig>