[SciPy-Dev] On deprecating `stats.threshold`

josef.pktd at gmail.com josef.pktd at gmail.com
Thu Jun 18 08:27:50 EDT 2015


On Thu, Jun 18, 2015 at 6:16 AM, Julian Taylor <
jtaylor.debian at googlemail.com> wrote:

> On Wed, Jun 17, 2015 at 10:44 PM, Abraham Escalante <aeklant at gmail.com>
> wrote:
> > Hello all,
> >
> > As part of the ongoing scipy.stats improvements we are pondering the
> > deprecation of `stats.threshold` (and its masked array counterpart:
> > `mstats.threshold`) for the following reasons.
> >
> > The functionality it provides is nearly identical to `np.clip`.
> > Its usage does not seem to be common (Ralf made a search with searchcode;
> > it is not used in scipy as a helper function either).
>
> I don't think those are sufficient reasons for deprecation.
> It does fulfill a purpose, as it's not exactly the same as np.clip; the
> implementation is simple and maintainable, and it's well documented.
> There has to be something bad or dangerous about the function to
> warrant issuing warnings on usage.
>


I pretty much share David's view: it has interesting use cases, but
it's not worth it.

The use case I was thinking of is to calculate trimmed statistics with nan
aware functions.
Similar to David's example, we can set outliers, i.e. points beyond the
threshold, to nan, and then use nanmean and nanstd to calculate the trimmed
statistics.
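
A minimal sketch of that idea, using plain numpy rather than
stats.threshold itself. The data and the bounds `lo` and `hi` are made up
for illustration; np.where stands in for the thresholding step.

```python
import numpy as np

# Hypothetical data: two columns, each with one obvious outlier.
x = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 500.0],
              [-99.0, 30.0]])
lo, hi = 0.0, 100.0  # assumed thresholds, chosen only for this example

# Replace everything outside [lo, hi] with nan; the shape is preserved.
masked = np.where((x < lo) | (x > hi), np.nan, x)

# nan-aware reductions skip the replaced points, column by column.
col_mean = np.nanmean(masked, axis=0)
col_std = np.nanstd(masked, axis=0)
```

Because the array keeps its shape, the axis argument still works, which
is the main attraction over dropping elements outright.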

Trimming drops the outliers, while np.clip "winsorizes" them, i.e. shrinks
them to the thresholds.
For this use case, np.clip is not a replacement for stats.threshold.
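
To make the difference concrete, here is a small sketch with made-up
numbers (the array and the bounds are assumptions for illustration only):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 100.0])
lo, hi = 0.0, 10.0  # assumed thresholds for this example

# Winsorizing: the outlier is pulled in to the bound but still counted.
winsorized = np.clip(x, lo, hi)

# Trimming: the outlier is removed from the sample entirely.
trimmed = x[(x >= lo) & (x <= hi)]

winsorized_mean = winsorized.mean()  # 4.0, from (1 + 2 + 3 + 4 + 10) / 5
trimmed_mean = trimmed.mean()        # 2.5, from (1 + 2 + 3 + 4) / 4
```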

However:

My guess is that this was used as a helper function for the trimmed
statistics in scipy.stats but lost its use during some refactoring.

As a public function it would belong to numpy.  I didn't remember
stats.threshold, and it's easy to "inline" by masked indexing. I don't
think users would think about looking for it in scipy.stats (as indicated
by the missing use according to Ralf's search).
Even if I remembered the threshold function, I wouldn't use it, because then
I would need to import scipy.stats and large parts of scipy (which is slow on
a cold start) just for a one-liner.
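
For reference, the kind of inline replacement I have in mind looks roughly
like this. It is a sketch with numpy only; `lo`, `hi` and the fill value
`newval` are illustrative names and values, not anything taken from
scipy.

```python
import numpy as np

a = np.array([-5.0, 1.0, 2.0, 3.0, 50.0])
lo, hi = 0.0, 10.0  # assumed thresholds
newval = 0.0        # assumed fill value for out-of-range points

out = a.copy()                          # leave the input untouched
out[(out < lo) | (out > hi)] = newval   # the actual one-liner
```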

Josef




