[SciPy-Dev] Entropy from empirical high-dimensional data
Gael Varoquaux
gael.varoquaux at normalesup.org
Wed May 25 17:35:33 EDT 2011
Hi list,
I am looking at estimating entropy and conditional entropy from data for
which I only have access to observations, not the underlying
probability distributions.
With low-dimensional data, I would simply form an empirical estimate of
the probabilities by converting each observation to its quantile, and
then apply the standard formula for entropy (for instance using
scipy.stats.entropy).
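A minimal sketch of this kind of plug-in estimate for two variables. It reads "converting each observation to its quantile" as equal-frequency binning, which is an assumption. The helper names (quantile_bins, plugin_entropies), the bin count and the synthetic data are hypothetical and only there for illustration. The result is the entropy of the discretized variables, in nats, not a differential entropy.

```python
import numpy as np
from scipy.stats import entropy


def quantile_bins(x, n_bins):
    """Assign each observation to one of n_bins equal-frequency bins."""
    ranks = np.argsort(np.argsort(x))      # rank of each observation, 0 .. n-1
    return (ranks * n_bins) // x.size      # integer bin labels, 0 .. n_bins-1


def plugin_entropies(x, y, n_bins=10):
    """Plug-in estimates (in nats) of H(X), H(X, Y) and H(Y | X)."""
    bx, by = quantile_bins(x, n_bins), quantile_bins(y, n_bins)
    joint = np.zeros((n_bins, n_bins))
    np.add.at(joint, (bx, by), 1)          # joint histogram of bin labels
    h_xy = entropy(joint.ravel())          # scipy normalizes counts to probabilities
    h_x = entropy(joint.sum(axis=1))       # marginal counts of X
    return h_x, h_xy, h_xy - h_x           # chain rule: H(Y|X) = H(X,Y) - H(X)


# Synthetic, dependent 1-D variables (illustration only).
rng = np.random.default_rng(0)
x = rng.normal(size=30000)
y = x + 0.5 * rng.normal(size=30000)
h_x, h_xy, h_y_given_x = plugin_entropies(x, y)
```

With equal-frequency bins the marginal entropy is log(n_bins) by construction, so the informative quantities are the joint and conditional entropies.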
However, I have high-dimensional data (~100 features and 30000
observations). Not only is it harder to convert observations to
probabilities under the empirical distribution, but I am also worried
about curse-of-dimensionality effects: density estimation in high
dimensions is a difficult problem.
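To illustrate the concern, here is a rough sketch that uses synthetic data of the same shape (independent Gaussian features, which is an assumption, not the actual data). Even with the coarsest binning of two bins per feature there are 2**100 cells, so essentially every observation falls in its own cell and the plug-in estimate saturates near log(n_obs), whatever the true entropy is.

```python
from collections import Counter

import numpy as np
from scipy.stats import entropy

rng = np.random.default_rng(0)
n_obs, n_features = 30000, 100
X = rng.normal(size=(n_obs, n_features))   # synthetic stand-in for the real data

# Coarsest possible binning: keep only the sign of each feature (2**100 cells).
cells = np.packbits(X > 0, axis=1)         # pack each row of 100 bits into 13 bytes
counts = Counter(row.tobytes() for row in cells)

n_occupied = len(counts)                   # number of distinct cells observed
h_plugin = entropy(list(counts.values()))  # plug-in estimate, in nats
h_true = n_features * np.log(2)            # exact entropy of 100 independent fair signs
```

In this setting h_plugin cannot exceed log(n_obs), roughly 10.3 nats, while h_true is roughly 69.3 nats.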
Does anybody have advice, or Python code to point to, for this task?
Cheers,
Gaël