[Numpy-discussion] nan_to_num and bool arrays
Keith Goodman
kwgoodman at gmail.com
Fri Dec 11 19:03:55 EST 2009
On Fri, Dec 11, 2009 at 3:44 PM, Keith Goodman <kwgoodman at gmail.com> wrote:
> On Fri, Dec 11, 2009 at 2:22 PM, Robert Kern <robert.kern at gmail.com> wrote:
>> On Fri, Dec 11, 2009 at 16:09, Keith Goodman <kwgoodman at gmail.com> wrote:
>>> On Fri, Dec 11, 2009 at 1:14 PM, Robert Kern <robert.kern at gmail.com> wrote:
>>>> On Fri, Dec 11, 2009 at 14:41, Keith Goodman <kwgoodman at gmail.com> wrote:
>>>>> On Fri, Dec 11, 2009 at 12:08 PM, Bruce Southey <bsouthey at gmail.com> wrote:
>>>>
>>>>>> So I agree that it should leave the input untouched when a non-float
>>>>>> dtype is used for some array-like input.
>>>>>
>>>>> Would only one line need to be changed? Would changing
>>>>>
>>>>> if not issubclass(t, _nx.integer):
>>>>>
>>>>> to
>>>>>
>>>>> if not issubclass(t, _nx.integer) and not issubclass(t, _nx.bool_):
>>>>>
>>>>> do the trick?
>>>>
>>>> That still leaves strings, voids, and objects. I recommend:
>>>>
>>>> if issubclass(t, _nx.inexact):
>>>>
>>>> Arguably, one should handle nan float objects in object arrays and
>>>> float columns in structured arrays, but the current code does not
>>>> handle either of those anyways.
>>>
>>> Without your change both
>>>
>>>>> np.nan_to_num(np.array([True, False]))
>>>>> np.nan_to_num([1])
>>>
>>> raise exceptions. With your change:
>>>
>>>>> np.nan_to_num(np.array([True, False]))
>>> array([ True, False], dtype=bool)
>>>>> np.nan_to_num([1])
>>> array([1])
>>
>> I think this is correct, though the latter one happens by accident.
>> Lists don't have a .dtype attribute so obj2sctype(type([1])) is
>> checked and happens to be object_. The latter line is intended to
>> handle scalars, not sequences. I think that sequences should be
>> coerced to arrays for output and this check should be more explicit
>> about what it handles. [1.0] will have a problem if you don't.
>
> That makes sense. But I'm not smart enough to implement it.
>
>>> On a separate note, this seems a little awkward:
>>>
>>>>> np.nan_to_num(1.0)
>>> 1.0
>>>>> np.nan_to_num(1)
>>> array(1)
>>>>> x = np.ones(1, dtype=np.int)
>>>>> np.nan_to_num(x[0])
>>> 1
>>
>> Worth fixing.
>
> Would this work?
>
> def nan_to_num(x):
>     try:
>         t = x.dtype.type
>     except AttributeError:
>         t = obj2sctype(type(x))
>     if issubclass(t, _nx.complexfloating):
>         return nan_to_num(x.real) + 1j * nan_to_num(x.imag)
>     else:
>         try:
>             y = x.copy()
>         except AttributeError:
>             y = array(x)
>     if not y.shape:
>         y = array([x])
>         scalar = True
>     else:
>         scalar = False
>     if issubclass(t, _nx.inexact):
>         are_inf = isposinf(y)
>         are_neg_inf = isneginf(y)
>         are_nan = isnan(y)
>         maxf, minf = _getmaxmin(y.dtype.type)
>         y[are_nan] = 0
>         y[are_inf] = maxf
>         y[are_neg_inf] = minf
>     if scalar:
>         y = y[0]
>     return y
>
> Instead of
>
>>> nan_to_num(1.0)
> 1.0
>>> nan_to_num(1)
> array(1)
>>> nan_to_num(np.array(1.0))
> 1.0
>>> nan_to_num(np.array(1))
> array(1)
>
> it gives
>
>>> nan_to_num(1.0)
> 1.0
>>> nan_to_num(1)
> 1
>>> nan_to_num(np.array(1.0))
> 1.0
>>> nan_to_num(np.array(1))
> 1
>
> I guess a lot of unit tests need to be written before nan_to_num can
> be fixed. But for now, your bool fix is an improvement.
Ack! The "if issubclass(t, _nx.inexact)" fix doesn't work. It solves
the bool problem but it introduces its own problem since numpy.object_
is not a subclass of inexact:

>> nan_to_num([np.inf])
array([ Inf])

Yeah, way too many special cases here to do this without full unit
test coverage.
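
For illustration only, here is a minimal sketch of the approach suggested
earlier in the thread: coerce the input to an array first, then decide
based on the array's dtype rather than on type(x). It is not the NumPy
implementation. The names nan_to_num_sketch and _replace_nonfinite are
hypothetical, and it uses np.asarray, np.issubdtype and np.finfo from
current NumPy in place of the internal helpers (obj2sctype, _getmaxmin)
quoted above.

```python
import numpy as np


def _replace_nonfinite(a):
    """Replace nan/inf in a real floating array, in place (hypothetical helper)."""
    info = np.finfo(a.dtype)
    a[np.isnan(a)] = 0.0
    a[np.isposinf(a)] = info.max
    a[np.isneginf(a)] = info.min


def nan_to_num_sketch(x):
    """Hypothetical nan_to_num variant that coerces sequences before checking."""
    # Coerce lists, tuples and Python scalars so the dtype check below
    # always sees a real dtype instead of falling back to object_.
    arr = np.asarray(x)

    # Work on a copy with at least one dimension so that boolean-mask
    # assignment also works for scalar (0-d) input.
    out = np.array(arr, copy=True, ndmin=1)

    if np.issubdtype(out.dtype, np.complexfloating):
        # .real and .imag are views of a complex array, so these calls
        # modify `out` itself.
        _replace_nonfinite(out.real)
        _replace_nonfinite(out.imag)
    elif np.issubdtype(out.dtype, np.floating):
        _replace_nonfinite(out)
    # Bools, integers, strings, voids and objects are left untouched.

    # Scalar in, scalar out.
    if arr.ndim == 0:
        return out[0]
    return out
```

With this ordering, nan_to_num_sketch([np.inf]) sees a float64 array
rather than a list, so the inf is replaced, and bool or integer input
comes back as an unchanged copy. Object arrays holding nan floats and
float fields in structured arrays are still not handled.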