[SciPy-User] Robust fitting of an exponential distribution subpopulation

Antonino Ingargiola tritemio at gmail.com
Wed Mar 11 13:04:58 EDT 2015


Hi to the list,

I'm seeking the advice of the scientific Python community to solve the
following fitting problem. Suggestions on both the methodology and
particular software packages are appreciated.

I often need to fit a sample containing a (dominant)
exponentially-distributed sub-population. The non-exponential samples
(from an unknown distribution) mostly lie close to the origin of the
exponential distribution, so the simple approach I have used so far is to
select all the samples above a threshold and fit the exponential "tail"
with MLE.
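A minimal sketch of the thresholded tail fit described above. It relies on
the memoryless property: for an exponential with mean tau, the excess over
any threshold is again exponential with mean tau, so the MLE is the mean of
the excesses. The function name `fit_exp_tail`, the synthetic data, and the
threshold value are illustrative assumptions, not part of the original
problem.

```python
import numpy as np

def fit_exp_tail(samples, threshold):
    """MLE of the exponential mean (tau) using only samples above `threshold`.

    Hypothetical helper: assumes that everything above the threshold is
    purely exponential, so (x - threshold) has mean tau.
    """
    samples = np.asarray(samples, dtype=float)
    tail = samples[samples > threshold]
    if tail.size == 0:
        raise ValueError("no samples above the threshold")
    tau = np.mean(tail - threshold)
    return tau, tail.size

# Synthetic example (assumed data): a dominant exponential population with
# tau = 2.0 plus a small contaminating population close to the origin.
rng = np.random.default_rng(0)
exp_part = rng.exponential(scale=2.0, size=5000)
contaminant = rng.uniform(0.0, 0.3, size=500)
samples = np.concatenate([exp_part, contaminant])

tau_hat, n_tail = fit_exp_tail(samples, threshold=0.5)
print(f"tau = {tau_hat:.3f} from {n_tail} tail samples")
```

The estimate is only as good as the threshold: set too low it includes
contamination, set too high it discards most of the data.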

The problem is that the choice of the threshold is somewhat arbitrary.
Moreover, there can be a small set of outliers on the extreme right side of
the distribution that would bias the MLE fit.

To improve the accuracy, I'm thinking of using (and, if necessary,
implementing) some kind of robust fitting procedure. For example, a scheme
in which the outliers are identified by putting a threshold on the residuals
and this threshold is then optimized using some "goodness of fit" cost
function. Is this approach reasonable?
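One possible reading of this idea, sketched under stated assumptions: scan a
grid of candidate lower thresholds, fit the tail by MLE at each one, and keep
the threshold that minimizes the Kolmogorov-Smirnov distance between the tail
and the fitted exponential. The KS distance is only one choice of cost
function, and `scan_threshold`, `min_tail`, and the candidate grid are
hypothetical names and values. The sketch does not handle right-side
outliers, and because tau is estimated from the same data, the KS p-value is
not valid here; only the statistic is used.

```python
import numpy as np
from scipy import stats

def scan_threshold(samples, candidates, min_tail=50):
    """Return (threshold, tau, ks_distance, n_tail) with the smallest KS distance.

    Hypothetical helper: for each candidate threshold, the excesses over the
    threshold are fitted with an exponential (MLE) and compared with it.
    """
    samples = np.asarray(samples, dtype=float)
    results = []
    for th in candidates:
        tail = samples[samples > th] - th
        if tail.size < min_tail:
            continue  # too few samples for a meaningful comparison
        tau = tail.mean()  # MLE of the exponential mean
        ks = stats.kstest(tail, "expon", args=(0, tau)).statistic
        results.append((float(th), float(tau), float(ks), int(tail.size)))
    if not results:
        raise ValueError("no candidate threshold leaves enough samples")
    return min(results, key=lambda r: r[2])

# Synthetic example (assumed data), same structure as the problem statement.
rng = np.random.default_rng(0)
samples = np.concatenate([
    rng.exponential(scale=2.0, size=5000),  # dominant exponential part
    rng.uniform(0.0, 0.3, size=500),        # contamination near the origin
])

candidates = np.linspace(0.0, 1.5, 16)
best_th, best_tau, best_ks, n_tail = scan_threshold(samples, candidates)
print(f"threshold = {best_th:.2f}, tau = {best_tau:.3f}, KS = {best_ks:.4f}")
```

Since the KS distance of a correct model shrinks roughly as 1/sqrt(n), this
criterion tends to favor the lowest threshold that already excludes the
contamination, rather than simply the highest one.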

I am surely not the first to tackle this problem, so I would appreciate
any suggestions and specific pointers to help me get started.

Thank you,
Antonio