[SciPy-Dev] Limit number of iterations in sigma clipping

Ralf Gommers ralf.gommers at gmail.com
Sun Oct 18 11:23:26 EDT 2015


On Sun, Oct 18, 2015 at 1:18 PM, Sri Krishna <kitchi.srikrishna at gmail.com>
wrote:

> On 18 October 2015 at 15:05, Ralf Gommers <ralf.gommers at gmail.com> wrote:
>
>>
>>
>>
>> On Fri, Oct 16, 2015 at 6:21 AM, Sri Krishna <kitchi.srikrishna at gmail.com
>> > wrote:
>>
>>> Hi,
>>>
>>> The current sigma clipping function has no iters keyword.
>>>
>>
>> For clarity: this is stats.sigmaclip
>>
>>
>>> I feel it would be useful to include it in the function, with a default
>>> like `iters=None` so that if people do want to use the keyword they will be
>>> able to do so.
>>>
>>> With the defaults of `iters = None` it shouldn't break anyone's workflow
>>> and will correspond to the current default behaviour.
>>>
>>> If there is any interest in this, I can file a pull request soon.
>>>
>>
>> Makes sense to me to add this.
>>
>> I just saw that AstroPy has a much more extensive sigma clipping
>> function. Replacing stats.sigmaclip with
>> http://docs.astropy.org/en/latest/api/astropy.stats.sigma_clip.html
>> would be the way to go I think. In a backwards-compatible fashion of
>> course, so the `sigma` keyword from that function should be left out.
>>
>
> Well, I am working on a pull request on the astropy sigma clipping routine
> to incorporate the scipy function - The scipy function can be upto 100x
> faster than the astropy function in benchmarks that I've run, and I'm not
> convinced that it is worthwhile loosing the speed for features. So once the
> PR is merged, the astropy function will have the option to use the scipy
> function as well as the existing astropy function,
>

Ouch, 100x is a lot.


> As far as I can tell, the speed improvements in the scipy.stats.sigmaclip
> function come because we don't use masked arrays, and the sigma clipped
> data is discarded every iteration. We could potentially use another
> function instead of the mean (i.e., median or a user supplied function
> analogous to astropy) without too much of a speed hit.
>
> The only three features we need to port from the astropy function are the
> `cenfunc` and `varfunc` keywords to define a custom function to calculate
> mean/standard deviation and the `axis` keyword. I'm not sure we need to use
> masked arrays because of the previously mentioned performance implications.
>
> So should I work on a pull request to adapt the astropy function into
> scipy in a backwards compatible manner?
>

All the keyword you mentioned make sense to add I think, so yes please.

And a bit of bikeshedding: the names used by AstroPy aren't optimal imho.
Maybe:
- iters --> numiter
- cenfunc --> centerfunc

Ralf


> Thanks,
> Krishna
>
>
> _______________________________________________
> SciPy-Dev mailing list
> SciPy-Dev at scipy.org
> https://mail.scipy.org/mailman/listinfo/scipy-dev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/scipy-dev/attachments/20151018/324db1e5/attachment.html>


More information about the SciPy-Dev mailing list