[SciPy-Dev] Limit number of iterations in sigma clipping

Sri Krishna kitchi.srikrishna at gmail.com
Sun Oct 18 07:18:55 EDT 2015


On 18 October 2015 at 15:05, Ralf Gommers <ralf.gommers at gmail.com> wrote:

>
>
>
> On Fri, Oct 16, 2015 at 6:21 AM, Sri Krishna <kitchi.srikrishna at gmail.com>
> wrote:
>
>> Hi,
>>
>> The current sigma clipping function has no iters keyword.
>>
>
> For clarity: this is stats.sigmaclip
>
>
>> I feel it would be useful to include it in the function, with a default
>> like `iters=None` so that if people do want to use the keyword they will be
>> able to do so.
>>
>> With the defaults of `iters = None` it shouldn't break anyone's workflow
>> and will correspond to the current default behaviour.
>>
>> If there is any interest in this, I can file a pull request soon.
>>
>
> Makes sense to me to add this.
>
> I just saw that AstroPy has a much more extensive sigma clipping function.
> Replacing stats.sigmaclip with
> http://docs.astropy.org/en/latest/api/astropy.stats.sigma_clip.html would
> be the way to go I think. In a backwards-compatible fashion of course, so
> the `sigma` keyword from that function should be left out.
>

Well, I am working on a pull request on the astropy sigma clipping routine
to incorporate the scipy function - The scipy function can be upto 100x
faster than the astropy function in benchmarks that I've run, and I'm not
convinced that it is worthwhile loosing the speed for features. So once the
PR is merged, the astropy function will have the option to use the scipy
function as well as the existing astropy function,

As far as I can tell, the speed improvements in the scipy.stats.sigmaclip
function come because we don't use masked arrays, and the sigma clipped
data is discarded every iteration. We could potentially use another
function instead of the mean (i.e., median or a user supplied function
analogous to astropy) without too much of a speed hit.

The only three features we need to port from the astropy function are the
`cenfunc` and `varfunc` keywords to define a custom function to calculate
mean/standard deviation and the `axis` keyword. I'm not sure we need to use
masked arrays because of the previously mentioned performance implications.

So should I work on a pull request to adapt the astropy function into scipy
in a backwards compatible manner?

Thanks,
Krishna
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/scipy-dev/attachments/20151018/73955394/attachment.html>


More information about the SciPy-Dev mailing list