[SciPy-User] R: Re: R: Re: R: Re: Epanechnikov kernel

Patrick Marsh patrickmarshwx at gmail.com
Sat Jan 19 12:47:38 EST 2013


I should also add that you can approximate an Epanechnikov kernel with a
Gaussian kernel. See:
http://journals.ametsoc.org/doi/pdf/10.1175/BAMS-D-11-00200.1

The takeaway line is:

"Using the results of Marron and Nolan (1988), it can be shown that, when
comparing Epanechnikov and Gaussian kernels, the  Epanechnikov kernel must
be 2.2138 times larger than the Gaussian bandwidth to achieve a similar
response function."

So, you can take the bandwidth you'd like to use with the Epanechnikov
kernel, and divide it by 2.2138 and plug the result into the Gaussian
kernel. It's not exact, but the response is similar.
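
For reference, the 2.2138 factor appears to follow from the canonical-bandwidth formula of Marron and Nolan (1988), delta = (R(K) / mu2(K)^2)^(1/5), where R(K) is the integral of K^2 and mu2(K) is the second moment of the kernel. The sketch below assumes the standard one-dimensional kernels (Epanechnikov K(u) = 0.75*(1 - u^2) on [-1, 1], and the standard normal density for the Gaussian). The function names are made up for illustration and are not taken from the paper or from any library.

```python
import numpy as np


def canonical_bandwidth(roughness, second_moment):
    """Canonical bandwidth (R(K) / mu2(K)**2) ** (1/5) of a kernel K."""
    return (roughness / second_moment ** 2) ** 0.2


# Epanechnikov kernel on [-1, 1]: R(K) = 3/5, mu2(K) = 1/5
delta_epan = canonical_bandwidth(3.0 / 5.0, 1.0 / 5.0)

# Standard normal kernel: R(K) = 1 / (2 * sqrt(pi)), mu2(K) = 1
delta_gauss = canonical_bandwidth(1.0 / (2.0 * np.sqrt(np.pi)), 1.0)

# Ratio of Epanechnikov bandwidth to the equivalent Gaussian bandwidth
ratio = delta_epan / delta_gauss


def gaussian_bandwidth_from_epanechnikov(h_epan):
    """Gaussian bandwidth giving a response similar to Epanechnikov h_epan."""
    return h_epan / ratio


h_gauss = gaussian_bandwidth_from_epanechnikov(10.0)
print(ratio, h_gauss)
```

The ratio is derived here for one-dimensional kernels, so treat it as a rule of thumb when applying it to a 2D grid.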


Patrick
---
Patrick Marsh
Ph.D. Candidate / Liaison to the HWT
School of Meteorology / University of Oklahoma
Cooperative Institute for Mesoscale Meteorological Studies
National Severe Storms Laboratory
http://www.patricktmarsh.com


On Sat, Jan 19, 2013 at 11:46 AM, Patrick Marsh <patrickmarshwx at gmail.com> wrote:

> I apologize if this is a duplicate...I used the wrong email initially and
> wasn't sure if it would go through the listserv....
>
>
>
>
> I've previously coded up a Cython version of the Epanechnikov kernel.  You
> can find the function here:
>
> https://gist.github.com/4573808
>
> It's certainly not optimized. It was a quick hack for use with rare
> (spatial) meteorological events. As the grid density increases, the
> performance decreases significantly. For dense grids, your best bet would be
> to create a grid that holds the weights of the Epanechnikov kernel, and do an
> FFT convolution of the two grids. A pseudocode example (that I believe
> should work) is shown below...
>
>
> ============================================
> import numpy as np
> from scipy import signal
> # Assumes the Cython function from the gist linked above has been built
> # into an importable module named `epanechnikov`.
> from epanechnikov import epanechnikov
>
> data_to_kde = ...  # Your 2D array
>
> # Create a grid with a value of 1 at the midpoint
> raw_epan_grid = np.zeros((51, 51), dtype=np.float64)
> raw_epan_grid[25, 25] = 1
>
> # Convert this binary grid into the weights of the Epanechnikov kernel
> bandwidth = 10
> dx = 1
> epan_kernel = epanechnikov(raw_epan_grid, bandwidth, dx)
>
> # Use fftconvolve to do the smoothing in Fourier space
> data_smoothed = signal.fftconvolve(data_to_kde, epan_kernel, mode='same')
> ============================================
>
>
> This is slower than the function linked above for sparse grids, but faster
> for dense grids. (The runtime of fftconvolve depends on the size of
> your arrays, not the density.)
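
If the Cython function from the gist is not at hand, the same idea can be sketched in plain NumPy. This is only an illustration under stated assumptions: the helper name `epanechnikov_kernel_grid` is hypothetical, the weights use the radial form 1 - (r/bandwidth)^2 inside the bandwidth and zero outside, and they are normalized to sum to 1, which may differ from what the gist does.

```python
import numpy as np
from scipy import signal


def epanechnikov_kernel_grid(bandwidth, dx=1.0):
    """Hypothetical helper: radial Epanechnikov weights on a square grid.

    The weights are 1 - (r / bandwidth)**2 for r < bandwidth and 0 elsewhere,
    normalized so that they sum to 1 (an assumption made for this sketch).
    """
    half = int(np.floor(bandwidth / dx))
    offsets = np.arange(-half, half + 1) * dx
    xx, yy = np.meshgrid(offsets, offsets, indexing="ij")
    r2 = (xx ** 2 + yy ** 2) / bandwidth ** 2
    weights = np.where(r2 < 1.0, 1.0 - r2, 0.0)
    return weights / weights.sum()


# Made-up sparse event grid: two events on a 101 x 101 grid
data_to_kde = np.zeros((101, 101))
data_to_kde[50, 50] = 1.0
data_to_kde[20, 70] = 1.0

kernel = epanechnikov_kernel_grid(bandwidth=10, dx=1)

# Smooth in Fourier space; mode="same" keeps the shape of data_to_kde
data_smoothed = signal.fftconvolve(data_to_kde, kernel, mode="same")
```

Because the kernel sums to 1, the smoothed field keeps the total event count as long as the events sit far enough from the grid edge.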
>
>
> Hope this helps
> Patrick
>
> ---
> Patrick Marsh
> Ph.D. Candidate / Liaison to the HWT
> School of Meteorology / University of Oklahoma
> Cooperative Institute for Mesoscale Meteorological Studies
> National Severe Storms Laboratory
> http://www.patricktmarsh.com
>
>
> On Sat, Jan 19, 2013 at 9:18 AM, francescoboccacci at libero.it <
> francescoboccacci at libero.it> wrote:
>
>> Thanks Josef, I will look into it.
>> I'm using SciPy version 0.9.0, so I need to update it.
>> If I have any problems I will ask you again :).
>> Thanks for your time
>>
>> Francesco
>>
>> >----Original message----
>> >From: josef.pktd at gmail.com
>> >Date: 19/01/2013 16.06
>> >To: "francescoboccacci at libero.it"<francescoboccacci at libero.it>, "SciPy Users List"<scipy-user at scipy.org>
>> >Subject: Re: [SciPy-User] R: Re: R: Re: Epanechnikov kernel
>> >
>> >On Sat, Jan 19, 2013 at 9:57 AM, francescoboccacci at libero.it
>> ><francescoboccacci at libero.it> wrote:
>> >> Hi,
>> >> I would like to use an Epanechnikov kernel because I want to replicate an R
>> >> function that uses an Epanechnikov kernel.
>> >> Reading the documentation below in more depth:
>> >>
>> >>
>> >> http://rgm3.lab.nig.ac.jp/RGM/r_function?p=adehabitatHR&f=kernelUD
>> >>
>> >> I found that I can use a normal kernel (I think this means a Gaussian kernel).
>> >> Below is a piece of my code:
>> >>
>> >>
>> >>                   xmin = min(xPoints)
>> >>                   xmax = max(xPoints)
>> >>                   ymin = min(yPoints)
>> >>                   ymax = max(yPoints)
>> >>                   X,Y = np.mgrid[xmin:xmax:40j, ymin:ymax:40j]
>> >>                   positions = np.vstack([X.ravel(), Y.ravel()])
>> >>                   values = np.vstack([xPoints,yPoints])
>> >>                   # scipy.stats.gaussian_kde: representation of a
>> >>                   # kernel-density estimate using Gaussian kernels.
>> >>                   kernel = stats.gaussian_kde(values)
>> >>
>> >>                   Z = np.reshape(kernel(positions).T, X.shape)
>> >>
>> >> If I understand correctly, the missing part that I have to implement is the
>> >> smoothing parameter h:
>> >>
>> >> h = Sigma*n^(-1/6)
>> >>
>> >> where
>> >>
>> >> Sigma = 0.5*(sd(x)+sd(y))
>> >>
>> >>
>> >> My new question is:
>> >>
>> >> How can I set the smoothing parameter in the gaussian_kde function? Is it
>> >> possible?
>> >
>> >In a recent scipy (since 0.10 IIRC) you can directly set the bandwidth
>> >without subclassing
>> >
>> >
>> >http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.gaussian_kde.html#scipy.stats.gaussian_kde
>> >
>> >http://docs.scipy.org/doc/scipy/reference/tutorial/stats.html#kernel-density-estimation
>> >
>> >Josef
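
A minimal sketch of setting the bandwidth through the `bw_method` argument of `scipy.stats.gaussian_kde`, assuming a SciPy version that provides that argument. A scalar `bw_method` is used as a factor that scales the data covariance (the kernel covariance is the data covariance times the factor squared), so it is not an absolute bandwidth in data units. For 2D data the default Scott factor is n^(-1/6), the same exponent as in the h formula quoted above, although `gaussian_kde` scales the full data covariance rather than the averaged standard deviation Sigma. The sample data and variable names are invented for illustration.

```python
import numpy as np
from scipy import stats

# Made-up point positions standing in for xPoints / yPoints
rng = np.random.default_rng(0)
x_points = rng.normal(0.0, 1.0, 200)
y_points = rng.normal(0.0, 2.0, 200)
values = np.vstack([x_points, y_points])

n = values.shape[1]

# Scalar bandwidth factor with the same n**(-1/6) scaling as the h formula
factor = n ** (-1.0 / 6.0)
kernel = stats.gaussian_kde(values, bw_method=factor)

# For comparison: the default (Scott) rule, which for 2D data is n**(-1/6)
default_kernel = stats.gaussian_kde(values)

# Evaluate the density on a 40 x 40 grid, as in the snippet above
X, Y = np.mgrid[x_points.min():x_points.max():40j,
                y_points.min():y_points.max():40j]
positions = np.vstack([X.ravel(), Y.ravel()])
Z = np.reshape(kernel(positions), X.shape)
```

To make the Gaussian estimate smoother or sharper than this rule, multiply `factor` by a constant before passing it in.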
>> >
>> >>
>> >> Thanks
>> >>
>> >> Francesco
>> >>
>> >>
>> >>>----Original message----
>> >>>From: jsseabold at gmail.com
>> >>>Date: 19/01/2013 15.21
>> >>>To: "francescoboccacci at libero.it"<francescoboccacci at libero.it>, "SciPy Users List"<scipy-user at scipy.org>
>> >>>Subject: Re: [SciPy-User] R: Re: Epanechnikov kernel
>> >>>
>> >>>On Sat, Jan 19, 2013 at 8:48 AM, francescoboccacci at libero.it
>> >>><francescoboccacci at libero.it> wrote:
>> >>>> Hi,
>> >>>> is it possible to do multivariate KDE using an Epanechnikov kernel? My
>> >>>> variables are X and Y (point positions).
>> >>>>
>> >>>>
>> >>>
>> >>>As Josef mentioned there is no way for the user to choose the kernel
>> >>>at present. The functionality is there, but it needs to be hooked in
>> >>>with a suitable API. I didn't keep up with these discussions, so I
>> >>>don't know the current status. If it's something you're interested in
>> >>>trying to help with, I'm sure people would be appreciative and you can
>> >>>ping the statsmodels mailing list.
>> >>>
>> >>>Practically though, the reason this hasn't been done yet is that the
>> >>>choice of the kernel is not all that important. Bandwidth selection is
>> >>>the most important variable and other kernels perform similarly given
>> >>>a good bandwidth. Is there any particular reason you want Epanechnikov
>> >>>kernel in particular?
>> >>>
>> >>>Skipper
>> >>>
>> >>>> Thanks
>> >>>>
>> >>>> Francesco
>> >>>>
>> >>>>>----Original message----
>> >>>>>From: jsseabold at gmail.com
>> >>>>>Date: 19/01/2013 14.32
>> >>>>>To: "SciPy Users List"<scipy-user at scipy.org>
>> >>>>>Subject: Re: [SciPy-User] Epanechnikov kernel
>> >>>>>
>> >>>>>On Sat, Jan 19, 2013 at 7:49 AM,  <josef.pktd at gmail.com> wrote:
>> >>>>>> On Sat, Jan 19, 2013 at 6:34 AM, francescoboccacci at libero.it
>> >>>>>> <francescoboccacci at libero.it> wrote:
>> >>>>>>> Hi all,
>> >>>>>>>
>> >>>>>>> I have a question for you. Is it possible to use an Epanechnikov
>> >>>>>>> kernel function in SciPy?
>> >>>>>>>
>> >>>>>>> I checked the SciPy documentation, and it seems that the only way to
>> >>>>>>> calculate a kernel-density estimate is with Gaussian kernels.
>> >>>>>>>
>> >>>>>>> Is that true?
>> >>>>>>
>> >>>>>> Yes, kde in scipy.stats only has gaussian_kde
>> >>>>>>
>> >>>>>> Also in statsmodels currently only gaussian is supported for
>> >>>>>> continuous data
>> >>>>>> http://statsmodels.sourceforge.net/devel/nonparametric.html
>> >>>>>> (It was removed because in the references only the bandwidth selection
>> >>>>>> made much difference in the estimation, but not the shape of the
>> >>>>>> kernel. Other kernels for continuous variables will come back
>> >>>>>> eventually.)
>> >>>>>
>> >>>>>If you're interested in univariate KDE, then we do have the Epanechnikov kernel.
>> >>>>>
>> >>>>>http://statsmodels.sourceforge.net/devel/generated/statsmodels.nonparametric.kde.KDEUnivariate.fit.html#statsmodels.nonparametric.kde.KDEUnivariate.fit
>> >>>>>
>> >>>>>Skipper
>> >>>>>_______________________________________________
>> >>>>>SciPy-User mailing list
>> >>>>>SciPy-User at scipy.org
>> >>>>>http://mail.scipy.org/mailman/listinfo/scipy-user
>> >>>>>