[SciPy-User] Weighted KDE

Joe Kington joferkington at gmail.com
Mon Jan 14 14:58:08 EST 2013


On Jan 14, 2013 11:31 AM, "Jackson Li" <sonicboomed at yahoo.com> wrote:
>
> On Sun, Jan 13, 2013 at 10:44 AM, Joe Kington <joferkington at gmail.com> wrote:
>
> > For what it's worth, the code you linked to is much slower for small
> > sample sizes. It's only faster with large numbers (>1e4) of points. It
> > also has a bit of a different use case than gaussian_kde. It's only
> > intended for making a regularly gridded KDE of a very large number of
> > points on a relatively fine grid. It bins the data onto a regular grid
> > and convolves it with an appropriate gaussian kernel. This is a
> > reasonable approximation when you're dealing with a large number of
> > points, but not so reasonable if you only have a handful. Because the
> > size of the gaussian kernel can be very large when the sample size is
> > low, the convolution can be very slow for small sample sizes. Also, if I
> > recall correctly, there's a stray flipud that got left in there. You'll
> > want to take it out. (Also, while I think that got posted only a couple
> > of years ago, I wrote it much longer ago than that... There's some
> > less-than-ideal code in there...)
> >
> > However, are you sure that you want a kernel density estimate?  What
> > you're describing sounds like interpolation, not a weighted KDE.
> >
> > As an example, a weighted KDE would be used when you wanted to show the
> > density of point estimates while weighting it by error in the location
> > of the point.
> >
>
> >>I shouldn't have said "error in the location of the point". I guess it
> >>would be more like "confidence that the point exists" or, more
> >>accurately, "magnitude of the point". Otherwise, the size of the Gaussian
> >>kernel would have to change depending on the data involved.
>
> >>As another (not exact) example, it can be handy when you want to sum
> >>some attribute over a map to yield a density estimate per-unit-area (e.g.
> >>population density, where you have populations of cities as your point
> >>measurements). In other words, if you want your temperature values to be
> >>summed-per-unit-area, then it's what you want. If you want to
> >>interpolate, it's not what you want.
>
>
> >
> > Instead, it sounds like you have a third variable that you want to make
> > a continuous map of based on irregularly sampled points.  If so, have a
> > look at scipy.interpolate (and particularly scipy.interpolate.Rbf).
> >
> > Hope that helps,
> > -Joe
>
>
> Hi,
>
> Thanks for the quick reply.
>
> What you described for the population of cities is indeed what I want.
>
> I have several data points spread out randomly in XY space, and each data
> point has an independent third variable.
>
> (e.g. for 2 points very close to each other, one 50 and another 10, and
> all other data points are far away.

You're describing interpolation, for whatever it's worth.

You want to interpolate your "z" values, not determine the number of
samples you have per unit area.

A KDE will give you "bull's-eyes" around where you have data, and the
resulting values won't directly reflect the weight values you pass in.
Instead, the values will mostly reflect where you have clusters of point
measurements, modified by the localized sum of the weights. The exact value
you get will depend on the covariance of your sampled point distribution.
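
To make that concrete, here is a rough sketch with made-up data: a tight
cluster of low-valued points plus one isolated high-valued point. It assumes
a SciPy recent enough that gaussian_kde accepts a `weights` argument (that
was added in SciPy 1.2, so it is not available in older releases). The
variable names and the numbers are purely illustrative.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)

# Hypothetical data: 100 clustered points with z = 10, plus one isolated
# point far away with z = 50.
cluster_xy = rng.normal(loc=0.0, scale=0.3, size=(2, 100))
lone_xy = np.array([[5.0], [5.0]])
xy = np.hstack([cluster_xy, lone_xy])            # shape (2, 101)
z = np.concatenate([np.full(100, 10.0), [50.0]])

# Weighted KDE: each point contributes a kernel scaled by its z value.
# The kernel shape is derived from the covariance of the sample locations.
kde = gaussian_kde(xy, weights=z)

# Evaluate the estimate at the cluster centre and at the isolated point.
density_at_cluster = kde([[0.0], [0.0]])[0]
density_at_lone = kde([[5.0], [5.0]])[0]

print("cluster (z = 10 each):", density_at_cluster)
print("isolated (z = 50):    ", density_at_lone)
```

The estimate comes out larger at the cluster than at the isolated point,
even though the isolated point carries the largest z. The result tracks
where the samples pile up, scaled by the local sum of weights, rather than
the z values themselves.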

Instead, you want a smooth surface that reflects your sampled z values.

Have a look at some of the examples involving scipy.interpolate.griddata or
scipy.interpolate.Rbf.
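
A minimal sketch of both, again with made-up sample data (the x, y, z arrays
and the grid extent below are assumptions for illustration, not anything
from your dataset):

```python
import numpy as np
from scipy.interpolate import Rbf, griddata

rng = np.random.default_rng(0)

# Hypothetical scattered samples: x, y locations with a measured z at each.
x = rng.uniform(0, 10, 30)
y = rng.uniform(0, 10, 30)
z = np.sin(x) + np.cos(y)

# Regular grid to interpolate onto.
xi, yi = np.meshgrid(np.linspace(0, 10, 50), np.linspace(0, 10, 50))

# Piecewise-linear interpolation on a triangulation of the samples.
# Grid nodes outside the convex hull of the samples come back as NaN.
z_lin = griddata((x, y), z, (xi, yi), method="linear")

# Radial basis function interpolation. With the default smooth=0 the
# surface passes through every sample value.
rbf = Rbf(x, y, z, function="linear")
z_rbf = rbf(xi, yi)

print(z_lin.shape, z_rbf.shape)
```

Either z_lin or z_rbf can then be handed to something like matplotlib's
imshow or pcolormesh to get the red/blue map. If nearby samples with very
different values should be blended rather than honored exactly, Rbf's
`smooth` argument can be set to a positive value, though how much smoothing
is appropriate depends on the data.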

The cookbook is a bit out of date, but take a look at the second example on
this page: http://www.scipy.org/Cookbook/RadialBasisFunctions

Hope that helps!
-Joe

>
> --> I would like that patch to get a value of 30 (average))
>
>
> Hence, I would like to obtain an XY graph showing the density estimate of
> the third variable.
>
> (if that patch is mostly high temperature on average, it should be "red",
> and if it is empty or has a lot of low temperature data points, then it
> should be "blue".)
>
>
> Thank you!
>
> Jackson