better way to do this in Numeric

Todd Miller jmiller at stsci.edu
Mon Aug 4 14:16:37 EDT 2003


John Hunter wrote:
> I have a list of indices and want to assign the number of times an
> index shows up in the list to an array with the count specified at the
> index; ie,
> 
>     from Numeric import *
> 
>     a = zeros((10,), Int)
>     ind = [1,1,4,4,4,4,7,7,9,9,9,9,9]
> 
>     for i in ind:
>         a[i] += 1
> 
> I'm wondering if there is a way to use some combination of Numeric
> array functions to make this speedy.
> 
> Thanks,
> John Hunter
> 

There's a simple histogram function defined in the Numeric manual, based 
on the searchsorted function.

 >>> def histogram(a, bins):
...     n = searchsorted(sort(a), bins)
...     n = concatenate([n, [len(a)]])
...     return n[1:]-n[:-1]


If you put the bin edges halfway between the integers, starting at -0.5
so that bin i holds the count for index i:

 >>> bins = arange(-0.5, 9.5, 1)

I think it does what you want:

 >>> ind = [1,1,4,4,4,4,7,7,9,9,9,9,9]
 >>> histogram(ind, bins)
array([0, 2, 0, 0, 4, 0, 0, 2, 0, 5])
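
For anyone without Numeric at hand, the same sort-then-search idea can be
sketched in plain Python. This is only an illustration, not the code from the
Numeric manual: it assumes bisect_left from the standard library as a stand-in
for searchsorted, and the names edges, cuts and counts are made up for the
example. The edges run -0.5, 0.5, ..., 8.5 so that slot i of the result counts
occurrences of index i.

```python
from bisect import bisect_left


def histogram(values, edges):
    """Count the values in each half-open bin [edges[k], edges[k+1]).

    The last bin is open-ended: it collects everything >= edges[-1].
    """
    ordered = sorted(values)
    # Position of each edge within the sorted data; this plays the role
    # of searchsorted(sort(a), bins) in the Numeric version.
    cuts = [bisect_left(ordered, edge) for edge in edges]
    cuts.append(len(ordered))
    # Differences between successive positions are the per-bin counts.
    return [hi - lo for lo, hi in zip(cuts, cuts[1:])]


ind = [1, 1, 4, 4, 4, 4, 7, 7, 9, 9, 9, 9, 9]
edges = [i - 0.5 for i in range(10)]  # -0.5, 0.5, ..., 8.5
counts = histogram(ind, edges)
print(counts)
```

Here counts[i] should equal what the explicit loop leaves in a[i]. The sort
dominates the cost, which is why the speed of the sort routine matters so much
for this approach.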

In numarray CVS, histogram is ~7x faster than the explicit for loop
you posted.  (I tried this with test cases of 10**5 and 10**6 elements.)

Sadly, Numeric has much slower sort functions than numarray, so in
Numeric histogram winds up ~2x slower than the loop.

I think you probably need an extension function to get really good 
performance.

Todd




