[Numpy-discussion] adding booleans

josef.pktd at gmail.com josef.pktd at gmail.com
Fri Jun 7 20:29:01 EDT 2013


On Fri, Jun 7, 2013 at 8:08 PM,  <josef.pktd at gmail.com> wrote:
> On Fri, Jun 7, 2013 at 7:48 PM, Nathaniel Smith <njs at pobox.com> wrote:
>> On 7 Jun 2013 21:58, <josef.pktd at gmail.com> wrote:
>>>
>>> Interesting observation, (while lurking on a pull request)
>>>
>>> >>> np.add.reduce(np.arange(5)<3)
>>> 3
>>> >>> np.add((np.arange(5)<3), (np.arange(5)<3))
>>> array([ True,  True,  True, False, False], dtype=bool)
>>>
>>>
>>> I often sum boolean arrays, but I didn't know about the second
>>> behavior.
>>
>> ...yeah weird. My gut reaction is that it's a bug. Addition on bools should
>> either be an error, undefined but doable via an implicit upcast to int
>> (analogous to calling np.sin on an int array triggering an upcast to float),
>> or xor (i.e., addition mod 2). But apparently we're inconsistent -
>> add.reduce upcasts, and add.__call__, uh... upcasts and then downcasts,
>> maybe? It's like if np.sin on an int array returned ints? I can't see how to
>> get the quoted behaviour in any conceptually coherent way. But maybe I'm
>> missing something.
>
> The first case is perfectly good behavior. I always "knew"/assumed
> that in Python, bools are 0-1 ints that follow all the usual calculation rules.
> Only the second one surprised me (it was found by Pauli).
>
>>>> reduce(np.add, [ True,  True,  True, False, False])
> True
>>>> reduce(lambda x, y: x+y, [ True,  True,  True, False, False])
> 3
>
>
> The following we use *very* often:
>
> proportion = (x > 0).mean()
> n_valid = isfinite(x).sum()
>
> cond = cond1 * cond2
>
> in Python: the indexing trick with a 0-1 bool
>>>> ["True", "False"][False]
> 'True'
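
The mean/sum/product idioms quoted above can be sketched as follows. The sample array `x` and the two conditions are made up for illustration; they are not from the original message.

```python
import numpy as np

# Hypothetical sample data, including one non-finite entry.
x = np.array([1.5, -2.0, np.nan, 0.0, 3.0])

# Fraction of entries that are strictly positive (NaN compares as False).
proportion = (x > 0).mean()

# Number of finite entries: summing a bool array counts the True values.
n_valid = np.isfinite(x).sum()

# Multiplying two bool arrays keeps the bool dtype and acts as a logical AND.
cond = (x > 0) * np.isfinite(x)

print(proportion, n_valid, cond)
```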

Python indexing with np.bool
>>> ["True", "False"][np.bool(False)]
'True'
>>> ["True", "False"][np.bool(True)]
'False'
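
A small sketch of why the indexing trick works: Python's `bool` is a subclass of `int`. For a NumPy boolean scalar, the example converts to `int` explicitly instead of relying on implicit index conversion, on the assumption that this is the safer route across NumPy versions. `np.bool_` is used here in place of the `np.bool` alias from the session above.

```python
import numpy as np

labels = ["True", "False"]

# Python's bool is a subclass of int, so False and True index like 0 and 1.
assert issubclass(bool, int)
first = labels[False]
second = labels[True]

# A NumPy boolean scalar is a separate type; convert it to int explicitly
# before using it as a list index.
np_flag = np.bool_(True)
via_numpy = labels[int(np_flag)]

print(first, second, via_numpy)
```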

operations between numbers and bools
>>> a = np.array([ True,  True,  True, False, False])
>>> a * range(5)
array([0, 1, 2, 0, 0])
>>> a * range(1, 6)
array([1, 2, 3, 0, 0])
>>> a + range(5)
array([1, 2, 3, 3, 4])
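
The same mixed operations, written as a script that also checks the result dtype. This sketch uses `np.arange` rather than `range`, which is a substitution made here for clarity.

```python
import numpy as np

a = np.array([True, True, True, False, False])
ints = np.arange(5)

# Mixing a bool array with an integer array upcasts the result to an integer dtype.
prod = a * ints
total = a + ints

print(prod, prod.dtype)
print(total, total.dtype)
```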


multiply and maximum don't need to upcast because the result stays 0-1

>>> reduce(np.multiply, [ True,  True,  True, False, False])
False
>>> np.multiply.reduce([ True,  True,  True, False, False])
0

>>> np.maximum.reduce([ True,  True,  True, False, False])
True
>>> np.maximum.accumulate([ True,  True,  True, False, False])
array([ True,  True,  True,  True,  True], dtype=bool)
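
A sketch comparing the pairwise and ufunc reductions above, with result types checked. On Python 3, `reduce` has to be imported from `functools`. Note that the `0` shown for `np.multiply.reduce` suggests an integer result, so that reduction does appear to upcast even though the value stays within 0-1.

```python
from functools import reduce

import numpy as np

values = [True, True, True, False, False]

# Pairwise application of the ufunc: every step is bool * bool, so it stays bool.
pairwise = reduce(np.multiply, values)

# The ufunc's own reduce method returns an integer for bool input.
ufunc_prod = np.multiply.reduce(values)

# maximum keeps the bool dtype for both reduce and accumulate.
ufunc_max = np.maximum.reduce(values)
running_max = np.maximum.accumulate(values)

print(type(pairwise), type(ufunc_prod), type(ufunc_max), running_max.dtype)
```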

also fine
>>> np.add.accumulate([ True,  True,  True, False, False])
array([1, 2, 3, 3, 3])

-----------

I think the only "weird" and inconsistent one is that bool1 + bool2
does not upcast to int.
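
A minimal sketch of the inconsistency, plus one way to get integer counts from an elementwise sum. The explicit `astype(int)` cast is a workaround suggested here, not something proposed in the thread.

```python
import numpy as np

mask = np.arange(5) < 3  # [True, True, True, False, False]

# Elementwise add on two bool arrays stays bool and behaves like a logical OR.
elementwise = np.add(mask, mask)

# reduce and accumulate upcast to an integer dtype and count the True values.
reduced = np.add.reduce(mask)
running = np.add.accumulate(mask)

# Casting explicitly before adding gives integer counts elementwise.
counts = mask.astype(int) + mask.astype(int)

print(elementwise, reduced, running, counts)
```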

Josef

>
> Josef
>
>>
>> -n
>>
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>


