[SciPy-User] KDE bandwidth selection question

Zachary Pincus zachary.pincus at yale.edu
Mon Feb 14 11:19:37 EST 2011


> I read the Kernel Density Estimation documentation online but I was
> unable to find any reference to the bandwidth selection algorithm (in
> scipy.stats, I mean). My question is: which algorithm does
> stats.gaussian_kde use to select the bandwidth?

The default is "Scott's Factor" (look at the source code), which is  
pretty simplistic but seems to work well.
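
If you'd rather check than read the source, here is a minimal sketch
(assuming a reasonably recent NumPy and SciPy; the sample data and the
variable names are made up for illustration). It compares the `factor`
attribute of a fitted gaussian_kde with n**(-1/(d+4)), and the
`covariance` attribute with the data covariance scaled by factor**2.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
data = rng.normal(size=(2, 200))   # gaussian_kde expects shape (d, n)

kde = stats.gaussian_kde(data)     # no bw_method given, so Scott's rule
d, n = data.shape
scott = n ** (-1.0 / (d + 4))      # the rule-of-thumb factor

# The fitted object exposes the factor and the kernel covariance.
factor_matches = np.isclose(kde.factor, scott)
cov_matches = np.allclose(kde.covariance, np.cov(data) * kde.factor ** 2)
print(kde.factor, scott, factor_matches, cov_matches)
```

Both comparisons should come out true for unweighted data like this.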

A while ago I asked where this came from, and Josef did some really  
helpful research. Below is his answer (and my original question etc).

Zach



>> I've been wading through the old literature on Gaussian KDE for a
>> little while trying to find a reference for the "Scott's factor"
>> rule of thumb for Gaussian KDE bandwidth selection (n**(-1/(d+4)),
>> where n is the number of data points and d their dimension; the data
>> covariance matrix is multiplied by the square of this factor to
>> yield the kernel covariance, i.e. the bandwidths).
>>
>> I can find a lot of Scott's later contributions of fancier methods,
>> but nothing about this basic one...
>
> Scott (1992) is the reference in Haerdle:
>
> http://books.google.com/books?id=qPCmAOS-CoMC&pg=PA73&lpg=PA73&dq=scott%27s+factor+rule-+of-thumb+hardle&source=bl&ots=kTNHJpyk6w&sig=5wwCOzThGsIzXOyVax2AbKQ11Rw&hl=en&ei=MOwlTdC3F4aBlAeRsZDNAQ&sa=X&oi=book_result&ct=result&resnum=1&sqi=2&ved=0CBYQ6AEwAA#v=onepage&q&f=false
>
> Haerdle's book is also online, but I need to look for the link.
>
> Josef

I think it's equation (3.70) in
http://fedc.wiwi.hu-berlin.de/xplore/ebooks/html/spm/spmhtmlnode18.html

with a page reference to Scott (1992), p. 152.

More of Haerdle online is here: http://fedc.wiwi.hu-berlin.de/xplore/ebooks/html/

Josef
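
To make the rule of thumb concrete in one dimension, here is a rough
sketch (not taken from the scipy source; the sample, the grid, and the
names `h`, `manual` and `reference` are assumptions for illustration).
With d = 1 the factor is n**(-1/5), so the kernel width is the sample
standard deviation times n**(-1/5). Averaging Gaussian kernels of that
width by hand should agree with what gaussian_kde returns by default.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(size=500)                  # 1-D sample, so d = 1
h = x.std(ddof=1) * x.size ** (-1.0 / 5)  # std times n**(-1/(d+4))

grid = np.linspace(-3.0, 3.0, 7)
# Average of Gaussian kernels of width h centred on each data point.
manual = stats.norm.pdf(grid[:, None], loc=x[None, :], scale=h).mean(axis=1)
reference = stats.gaussian_kde(x)(grid)   # default bandwidth selection

max_abs_diff = np.abs(manual - reference).max()
print(max_abs_diff)
```

The difference should be at the level of floating-point rounding.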





