entropy

Oh Kyu Yoon okyoon at stanford.edu
Wed Mar 17 07:31:08 EST 2004


I am only a newbie in Python, but I can say a few things about entropy.

Since a gaussian distribution is a continuous random variable,
you have to use "differential entropy" instead of "entropy".
They are very similar, except that for "differential entropy" you
integrate rather than sum.
If 'f' is the gaussian density, the differential entropy is
h(X) = - integrate(f*log(f) dx).
The density has to be normalized to unit area ( integrate(f dx) = 1 ).

I think if you normalize by the area, you'll get similar results.
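Here is a minimal sketch of what I mean (using numpy's histogram as a
modern stand-in for the Numeric/MLab calls in your script; the variable
names mirror yours). Normalize the bin counts to unit area, then
approximate h(X) = -integrate(f*log2(f) dx) by a Riemann sum:

```python
import numpy as np

# Sketch, not the original Numeric code: estimate the differential
# entropy of gaussian samples from an area-normalized histogram.
mu, sigma = 0.0, 2.0
rng = np.random.default_rng(0)
x = mu + sigma * rng.standard_normal(100000)

delta = 0.01
edges = np.arange(-12.0, 12.0 + delta, delta)
counts, _ = np.histogram(x, bins=edges)

# normalize so that integrate(f dx) = sum(f_i * delta) = 1
f = counts / (counts.sum() * delta)
f = f[f > 0]                      # drop empty bins (0*log 0 -> 0)

# Riemann sum for h(X) = -integrate(f*log2(f) dx)
h_computed = -np.sum(f * np.log2(f)) * delta
h_analytic = 0.5 * np.log2(2 * np.pi * np.e * sigma**2)
print(h_computed, h_analytic)     # these agree closely
```

(Your script below divides by the number of occupied bins rather than by
the total count times the bin width, so the p_i don't sum to 1. Note
also that the discrete entropy -sum p_i log2(p_i), with p_i = f_i*delta,
differs from the differential entropy by -log2(delta).)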

Ohkyu


"John Hunter" <jdhunter at ace.bsd.uchicago.edu> wrote in message
news:mailman.27.1079374026.12241.python-list at python.org...
>
> I am trying to compute the entropy of a time series (eg,
> http://en.wikipedia.org/wiki/Information_theory) using
>
> S = - sum p_i log2(p_i)
>
> According to the text I am using, the entropy of a gaussian
> distribution should be
>
> 1/2 log2(2 pi e sigma^2)
>
> so I am using this result to test my algorithm.  Unfortunately, I am
> not getting the results to agree.
>
> Can anyone tell me where I am going wrong?
>
>
>
> from Numeric import searchsorted, concatenate, arange, nonzero, log, \
>      sum, multiply, sort, greater, take, pi, exp
>
> from MLab import diff, randn
>
> def hist(y, bins):
>     n = searchsorted(sort(y), bins)
>     n = diff(concatenate([n, [len(y)]]))
>     return n
>
> # generate some gaussian numbers
> mu = 0.0
> sigma = 2.0
> x = mu + sigma*randn(100000)
>
> delta = 0.001
> bins = arange(-12.0, 12.0, delta)
>
> n = hist(x, bins)
>
> ind = nonzero(greater(n, 0.0))
> n = take(n, ind)         # get the positive
> n = 1.0/len(n)*n         # norm for probability; is this the right normalization?
> #n = 1.0/len(bins)*n     # or this? or something else?
>
> Scomputed = -1.0/log(2.0) * sum(multiply(n, log(n)))
> Sanalytic = 0.5/log(2.0) * log(2*pi*exp(1.0)*sigma**2)
>
> print Scomputed, Sanalytic
>
>
>
> Thanks!
> John Hunter
>




