[Numpy-discussion] A minor clarification on why count_nonzero is faster for boolean arrays

CJ Carey perimosocordiae at gmail.com
Thu Dec 17 13:37:56 EST 2015


I believe this line is the reason:
https://github.com/numpy/numpy/blob/c0e48cfbbdef9cca954b0c4edd0052e1ec8a30aa/numpy/core/src/multiarray/item_selection.c#L2110
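
To reproduce the comparison outside IPython, here is a minimal sketch using the standard timeit module. The helper name best_time, the fixed seed, and the loop counts are illustrative choices, not anything taken from NumPy. As far as I can tell from the linked source, boolean input takes a dedicated path in the C code instead of the generic per-element nonzero check, which would explain the gap, but treat that as a reading of the source rather than a guarantee.

```python
import timeit

import numpy as np

# Same shape and value range as in the original question; the seed is
# only here to make the run repeatable.
rng = np.random.RandomState(0)
a = rng.randint(0, 2, (100, 5))
a_bool = a.astype(bool)

# For an array of 0s and 1s, all three approaches return the same count.
n_sum = int(np.sum(a))
n_int = int(np.count_nonzero(a))
n_bool = int(np.count_nonzero(a_bool))
assert n_sum == n_int == n_bool


def best_time(func, number=10000, repeat=3):
    """Return the best per-call time in seconds (hypothetical helper)."""
    return min(timeit.repeat(func, number=number, repeat=repeat)) / number


timings = {
    "np.sum(a)": best_time(lambda: np.sum(a)),
    "np.count_nonzero(a)": best_time(lambda: np.count_nonzero(a)),
    "np.count_nonzero(a_bool)": best_time(lambda: np.count_nonzero(a_bool)),
}

for label, seconds in timings.items():
    print("%-26s %8.0f ns per call" % (label, seconds * 1e9))
```

Absolute numbers depend on the machine and the NumPy version, so only the relative ordering is worth comparing.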

On Thu, Dec 17, 2015 at 11:52 AM, Raghav R V <ragvrv at gmail.com> wrote:

> I was just playing with `count_nonzero` and found it to be significantly
> faster for boolean arrays than for integer arrays:
>
>
>     >>> a = np.random.randint(0, 2, (100, 5))
>     >>> a_bool = a.astype(bool)
>
>     >>> %timeit np.sum(a)
>     100000 loops, best of 3: 5.64 µs per loop
>
>     >>> %timeit np.count_nonzero(a)
>     1000000 loops, best of 3: 1.42 us per loop
>
>     >>> %timeit np.count_nonzero(a_bool)
>     1000000 loops, best of 3: 279 ns per loop (but why?)
>
> I tried looking into the code and dug my way through to this line
> <https://github.com/numpy/numpy/blob/c0e48cfbbdef9cca954b0c4edd0052e1ec8a30aa/numpy/core/src/multiarray/item_selection.c#L2172>.
> I am unable to dig further.
>
> I know this is probably a trivial question, but I was wondering if anyone
> could provide insight into why this is so.
>
> Thanks
>
> R
>