[Numpy-discussion] histogram: sum up values in each bin

josef.pktd at gmail.com
Thu Aug 27 09:19:15 EDT 2009


On Thu, Aug 27, 2009 at 8:23 AM, alexander baker
<baker.alexander at gmail.com> wrote:
> Here is an example; it does something extra at the end, but it shows how the
> bins can be used.
>
> Regards
>
> Alex Baker.
>
> from scipy.stats import norm
> r = norm.rvs(size=10000)
>
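> import numpy as np
> width = 50  # number of bins; undefined in the original post, value assumed here
> p, bins = np.histogram(r, width, normed=True)  # newer NumPy: density=True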
> db = bins[1]-bins[0]
> cdf = np.cumsum(p*db)
>
> from pylab import figure, show
> fig = figure()
> ax = fig.add_subplot(111)
> ax.bar(bins[:-1], cdf, width=0.8*db)
> show()
>
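> o = []
> rates = []
> # for each threshold t, compare the bin edges above and below it
> # (t and b are used so that the data array r and the builtin bin are not shadowed)
> for t in np.arange(0, max(bins), db):
>     G = max(np.cumsum([b for b in bins if b > t]))
>     L = min(np.cumsum([b for b in bins if b < t]))
>     o.append(abs(G/L))
>     rates.append(t)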
>
>
> 2009/8/27 Tim Michelsen <timmichelsen at gmx-topmail.de>
>>
>> Hello,
>> I need some advice on histograms.
>> If I interpret the documentation [1, 2] for numpy.histogram correctly, the
>> result of the function is a count of the occurrences sorted into each bin.
>>
>> (n, bins) = numpy.histogram(v, bins=50, normed=1)
>>
>> But how can I apply another function to the values that fall into each bin,
>> such as summing them up or computing averages?
>>
>> Thanks,
>> Timmie
>>
>> [1] http://docs.scipy.org/doc/numpy/reference/generated/numpy.histogram.html
>> [2] http://www.scipy.org/Tentative_NumPy_Tutorial#head-aa75ec76530ff51a2e98071adb7224a4b793519e
>>

Tim, do you mean that you want to apply other functions, e.g. mean or
variance, to the original values, but calculated per bin?
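
To make the question concrete: for sums and means, np.histogram itself may
already be enough, because its weights argument adds up a weight per value
instead of counting. The following is only a minimal sketch; the data and
the names v and edges are made up for illustration and are not from the
thread.

```python
import numpy as np

# Illustrative data and bin edges (assumptions, not from the thread).
v = np.array([0.1, 0.4, 0.6, 1.2, 1.7, 2.5, 2.9])
edges = np.array([0.0, 1.0, 2.0, 3.0])

# Plain histogram: number of values that fall into each bin.
counts, _ = np.histogram(v, bins=edges)

# Weighting each value by itself gives the sum of the values in each bin.
sums, _ = np.histogram(v, bins=edges, weights=v)

# Per-bin mean; empty bins are left as nan instead of dividing by zero.
means = np.full(len(counts), np.nan)
np.divide(sums, counts, out=means, where=counts > 0)
```

This covers sums and averages only; statistics such as the variance need
to know which values belong to which bin.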

If I read Alex's answer correctly, it only works with the bin counts.
To calculate e.g. the variance of all values per bin, I think the
easiest way would be to create a label array, with values arange(nbins-1),
that assigns each original data point to its bin, and then use np.bincount.
I don't know offhand what the easiest or fastest way is to create the
label array from the histogram bin boundaries.
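
One possible way to build such a label array is np.digitize applied to the
bin edges returned by np.histogram; this is a sketch and not necessarily
the fastest option. The data and the names v, nbins, s1 and s2 are
assumptions made for illustration. Here nbins is the number of bins, so the
labels run from 0 to nbins-1.

```python
import numpy as np

rng = np.random.default_rng(0)
v = rng.normal(size=1000)   # illustrative data, not from the thread
nbins = 10                  # number of bins (assumed value)

counts, edges = np.histogram(v, bins=nbins)

# Label array: index of the bin each value falls into.
# Digitizing against the interior edges maps every value to 0 .. nbins-1
# and puts the maximum into the last bin, as np.histogram does.
labels = np.digitize(v, edges[1:-1])

# np.bincount sums the weights per label: count, sum and sum of squares.
n = np.bincount(labels, minlength=nbins)
s1 = np.bincount(labels, weights=v, minlength=nbins)
s2 = np.bincount(labels, weights=v * v, minlength=nbins)

# Empty bins produce nan; the warnings for 0/0 are silenced.
with np.errstate(divide="ignore", invalid="ignore"):
    mean = s1 / n
    var = s2 / n - mean ** 2   # population variance per bin
```

The variance here is computed from the sum and the sum of squares, which
is compact but can lose precision when the values are large compared with
their spread.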

Josef


