[Numpy-discussion] Nice float -> integer conversion?

Sat Oct 15 15:20:51 EDT 2011

Hi,

On Tue, Oct 11, 2011 at 7:32 PM, Benjamin Root <ben.root at ou.edu> wrote:
> On Tue, Oct 11, 2011 at 2:06 PM, Derek Homeier
> <derek at astro.physik.uni-goettingen.de> wrote:
>>
>> On 11 Oct 2011, at 20:06, Matthew Brett wrote:
>>
>> > Have I missed a fast way of doing nice float to integer conversion?
>> >
>> > By nice I mean, rounding to the nearest integer, converting NaN to 0,
>> > inf, -inf to the max and min of the integer range?  The astype method
>> > and cast functions don't do what I need here:
>> >
>> > In [40]: np.array([1.6, np.nan, np.inf, -np.inf]).astype(np.int16)
>> > Out[40]: array([1, 0, 0, 0], dtype=int16)
>> >
>> > In [41]: np.cast[np.int16](np.array([1.6, np.nan, np.inf, -np.inf]))
>> > Out[41]: array([1, 0, 0, 0], dtype=int16)
>> >
>> > Have I missed something obvious?
>>
>> np.[a]round comes closer to what you wish (is there consensus
>> that NaN should map to 0?), but not quite there, and it's not really
>> consistent either!
>>
>
> In a way, there is already consensus in the code.  np.nan_to_num() by
> default converts nans to zero, and the infinities go to very large and very
> small.
>
>     >>> np.set_printoptions(precision=8)
>     >>> x = np.array([np.inf, -np.inf, np.nan, -128, 128])
>     >>> np.nan_to_num(x)
>     array([  1.79769313e+308,  -1.79769313e+308,   0.00000000e+000,
>             -1.28000000e+002,   1.28000000e+002])

Right, but we'd still need to round, and take care of the nasty
issue of thresholding (out-of-range values wrap around or come out as 0
when cast):

>>> x = np.array([np.inf, -np.inf, np.nan, -128, 128])
>>> x
array([  inf,  -inf,   nan, -128.,  128.])
>>> nnx = np.nan_to_num(x)
>>> nnx
array([  1.79769313e+308,  -1.79769313e+308,   0.00000000e+000,
        -1.28000000e+002,   1.28000000e+002])
>>> np.rint(nnx).astype(np.int8)
array([   0,    0,    0, -128, -128], dtype=int8)

So, I think nice_round would look something like:

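def nice_round(arr, out_type):
    in_type = arr.dtype.type
    # Integer limits expressed as values of the input float type
    mx = floor_exact(np.iinfo(out_type).max, in_type)
    mn = floor_exact(np.iinfo(out_type).min, in_type)
    nans = np.isnan(arr)
    # Clip to the integer range first, then round and cast
    out = np.rint(np.clip(arr, mn, mx)).astype(out_type)
    out[nans] = 0  # NaN maps to 0
    return out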

with floor_exact being something like:

https://github.com/matthew-brett/nibabel/blob/range-dtype-conversions/nibabel/floating.py
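
For anyone who wants to try this without the nibabel branch, here is a minimal self-contained sketch. The helper floor_exact_sketch is a hypothetical stand-in for floor_exact, not the code at the link above, and it assumes the input is a float32 or float64 array. The upper limit needs the helper because it may not be exactly representable: np.float64(2**63 - 1) rounds up to 2**63, which no longer fits in int64. The sketch also zeroes the NaNs before the cast rather than after, to avoid casting NaN to an integer.

```python
import numpy as np


def floor_exact_sketch(val, flt_type):
    """Largest flt_type value <= the Python int val.

    Hypothetical stand-in for floor_exact; assumes flt_type is
    np.float32 or np.float64 and val fits the float's finite range.
    """
    f = flt_type(val)  # rounds to the nearest representable float
    if int(f) > val:   # rounding went up, so step down by one float
        f = np.nextafter(f, flt_type(-np.inf))
    return f


def nice_round(arr, out_type):
    """Round to nearest integer, clip to out_type's range, map NaN to 0."""
    arr = np.asarray(arr)
    in_type = arr.dtype.type
    info = np.iinfo(out_type)
    mx = floor_exact_sketch(int(info.max), in_type)
    # Integer minima are 0 or a negative power of two, so this is exact
    mn = floor_exact_sketch(int(info.min), in_type)
    nans = np.isnan(arr)
    clipped = np.clip(arr, mn, mx)  # +/-inf land on the limits
    clipped[nans] = 0               # replace NaN before casting
    return np.rint(clipped).astype(out_type)


x = np.array([1.6, np.nan, np.inf, -np.inf, -128, 128])
print(nice_round(x, np.int8))  # values: 2, 0, 127, -128, -128, 127
```

Note that for wide integer types the upper limit comes out slightly below the true maximum. For float64 input and int64 output, inf maps to 2**63 - 1024, the largest float64 that still fits.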

See you,

Matthew