[SciPy-User] handling outliers

josef.pktd at gmail.com
Thu Nov 18 21:23:52 EST 2010


If you can estimate the systematic part, especially as a function that
is linear in parameters, then the robust estimators in
scikits.statsmodels could help. We have some residual diagnostic
measures, but some of the usual outlier statistics haven't been added
yet (they are still on the wish list).
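As a rough illustration of what a robust fit of a linear-in-parameters
model does, here is a hand-rolled sketch in plain NumPy. It is not the
scikits.statsmodels code: the function name huber_irls, the tuning
constant c=1.345, the simulated data, and the choice of regressors
(constant, trend, sin, cos) are all assumptions made for the example.
It uses iteratively reweighted least squares with Huber weights and a
MAD-based scale estimate.

```python
import numpy as np


def huber_irls(X, y, c=1.345, n_iter=50, tol=1e-8):
    """Hypothetical helper: Huber M-estimate via iteratively reweighted least squares."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # ordinary least squares start
    w = np.ones_like(y)
    for _ in range(n_iter):
        resid = y - X @ beta
        # Robust scale: median absolute deviation, normalized for Gaussian noise
        scale = np.median(np.abs(resid - np.median(resid))) / 0.6745
        if scale == 0:
            break
        u = np.abs(resid) / scale
        # Huber weights: 1 inside the threshold, c/u (downweighted) outside
        w = np.minimum(1.0, c / np.maximum(u, 1e-12))
        sw = np.sqrt(w)
        beta_new = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta, w


# Simulated series (assumed for illustration): trend plus sine, with a few large outliers
rng = np.random.default_rng(0)
t = np.linspace(0, 20, 200)
true_beta = np.array([1.0, 0.5, 2.0, 0.0])
X = np.column_stack([np.ones_like(t), t, np.sin(t), np.cos(t)])
y = X @ true_beta + rng.normal(scale=0.2, size=t.size)
outlier_idx = rng.choice(t.size, size=10, replace=False)
y[outlier_idx] += 15.0

beta_hat, w = huber_irls(X, y)
print("robust coefficients:", beta_hat)
print("weights at the outliers:", w[outlier_idx])
```

Points with a small final weight are the ones the fit treats as
outliers, so the weights can double as a crude diagnostic once the
systematic part has been estimated.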

Also, I have written a generic maximum likelihood estimator that
assumes t-distributed noise. Because the t-distribution is fat-tailed,
the estimator is also robust to outliers. It shouldn't be too difficult
to adjust to non-linear models.
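The idea can be sketched with scipy.optimize and scipy.stats alone.
This is a minimal sketch, not the estimator mentioned above: the
function names model and neg_loglike, the fixed degrees of freedom
(df=3), the simulated data, and the starting values taken from an
ordinary least squares fit are all assumptions for the example. The
mean function is kept linear in parameters here, but any function of
(beta, t) could be substituted for a non-linear model.

```python
import numpy as np
from scipy import optimize, stats


def model(beta, t):
    """Hypothetical mean function: constant + trend + sine/cosine terms."""
    return beta[0] + beta[1] * t + beta[2] * np.sin(t) + beta[3] * np.cos(t)


def neg_loglike(params, t, y, df):
    """Negative log-likelihood with t-distributed noise; last entry is log(scale)."""
    beta, log_scale = params[:-1], params[-1]
    resid = y - model(beta, t)
    return -np.sum(stats.t.logpdf(resid, df, loc=0.0, scale=np.exp(log_scale)))


# Simulated series (assumed for illustration) with a few large outliers
rng = np.random.default_rng(1)
t = np.linspace(0, 20, 200)
true_beta = np.array([1.0, 0.5, 2.0, 0.0])
y = model(true_beta, t) + rng.normal(scale=0.2, size=t.size)
bad = rng.choice(t.size, size=10, replace=False)
y[bad] += 15.0

# Ordinary least squares gives the starting values for the optimizer
X = np.column_stack([np.ones_like(t), t, np.sin(t), np.cos(t)])
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
start = np.append(beta_ols, np.log(np.std(y - X @ beta_ols)))

# Maximize the t likelihood (df fixed at 3, an assumption) over beta and log(scale)
res = optimize.minimize(neg_loglike, start, args=(t, y, 3.0), method="BFGS")
beta_t = res.x[:-1]
scale_t = np.exp(res.x[-1])

print("OLS coefficients:  ", beta_ols)
print("t-MLE coefficients:", beta_t)
print("estimated scale:   ", scale_t)
```

Optimizing over log(scale) keeps the scale positive without needing
bounds. The degrees of freedom could also be estimated, at the cost of
a harder optimization problem.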

If you want to fit a time series model, then I don't know of any
outlier-robust estimation implementation yet.

How well any of this works will depend on how easy it is in your case
to separate the noise from the signal.

Josef

On 11/18/10, Предеин П. А. <crmpeter at gmail.com> wrote:
> I have a question: what is the best way to remove outliers from an
> array (from a time series, for example)? I am considering using the
> scikits.timeseries library, and saw its anom() function. But the
> mean and deviation change dramatically over my series
> (like a sine, but with a growing y-level).
> Applying "mean()+/-3*std()" will cut some useful points and leave
> outliers elsewhere.
> Please help.


