[SciPy-User] Suggest moving curve_fit from minpack to top level

josef.pktd at gmail.com
Wed Feb 24 11:14:32 EST 2010


On Wed, Feb 24, 2010 at 10:34 AM, Robert Kern <robert.kern at gmail.com> wrote:
> On Wed, Feb 24, 2010 at 08:48,  <josef.pktd at gmail.com> wrote:
>
>> curve_fit doesn't really fit into scipy.optimize next to all the low
>> level optimizers.
>>
>> Given the recent discussion, I think it would fit better in
>> scipy.interpolate, which could be extended to cover both interpolation
>> and fitting of noisy data, e.g. rbf, least squares splines.
>
> Heh. This is precisely why scipy.odr is in scipy.odr. When I asked
> where it should go, I got three responses and four opinions.
> scipy.optimize and scipy.stats are both plausible places for these
> routines; scipy.interpolate is plausible if you squint, but not
> really, in my opinion. I would like to keep scipy.interpolate for real
> interpolation methods. If those interpolation methods also have a
> closely related mode for fitting noisy data, so be it, but we
> shouldn't expand the scope to include all fitting methods that aren't
> interpolation. Similarly, curve_fit() and leastsq() are really just
> small wrappers around the underlying minimizers that do the heavy
> lifting, so they ended up in scipy.optimize.
>
> It might be time for a new top-level package scipy.fitting that brings
> together all of the noisy fitting solutions and provides a nice place
> for new code like Jonathan Stickel's. However, I would only want to do
> that if we were providing something else along with it, like unifying
> the interfaces or providing generic cross-validation routines that
> would work with any fitter that had the right interface, etc. I think
> that moving things around just to have a better organization is a
> waste of time and is not justified by the costs of deprecating working
> code and invalidating umpteen docs, tutorials, archived emails, blog
> posts, Stack Overflow answers that are floating out there, being all
> Googleable as outdated documentation is wont to be.
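
To illustrate the point above that curve_fit() is a small wrapper around
leastsq(), here is a minimal sketch. The exponential model, the synthetic
data and the starting values are made up for the example; it also assumes
the default unbounded case, where curve_fit() hands the residuals to the
same Levenberg-Marquardt routine.

```python
import numpy as np
from scipy.optimize import curve_fit, leastsq


def model(x, a, b):
    # Hypothetical model, chosen only for this illustration.
    return a * np.exp(-b * x)


# Synthetic noisy data (made up for the example).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 4.0, 50)
y = model(x, 2.5, 1.3) + 0.05 * rng.standard_normal(x.size)


def residuals(params, x, y):
    # leastsq() minimizes the sum of squares of this vector.
    return model(x, *params) - y


p0 = [1.0, 1.0]

# Low-level interface: the caller writes the residual function.
p_leastsq, ier = leastsq(residuals, p0, args=(x, y))

# Convenience interface: the caller passes the model and the data;
# the residual function is built internally.
p_curve_fit, pcov = curve_fit(model, x, y, p0=p0)
```

Both calls should return essentially the same parameter estimates;
curve_fit() additionally returns an estimated covariance matrix.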

I mainly brought up moving curve_fit now because it hasn't been
included in any official release yet. I don't really care where the
category is located, but it would be good to have a place where this
can be expanded.

The main reason I thought of scipy.interpolate is that smoothers are
also mainly focused on fitting points in the interval of the sample
points. For general fitting and estimation there will be some overlap
with statsmodels, e.g. eventually we will get a statsmodels version of
curve_fit, but with a different focus.
Having some generic cross-validation methods available would be very
useful, e.g. also for bandwidth selection in kde.
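
As a rough sketch of what cross-validated bandwidth selection for a kde
could look like: the helper name loo_log_likelihood, the sample data and
the grid of candidate factors are all assumptions for illustration, not
existing scipy functionality. It scores each bandwidth factor of
scipy.stats.gaussian_kde by leave-one-out log-likelihood and keeps the
best one.

```python
import numpy as np
from scipy.stats import gaussian_kde


def loo_log_likelihood(data, factor):
    # Hypothetical helper: leave-one-out log-likelihood of a 1-D sample
    # for a given gaussian_kde bandwidth factor.
    total = 0.0
    for i in range(data.size):
        train = np.delete(data, i)  # drop the i-th observation
        kde = gaussian_kde(train, bw_method=factor)
        # Evaluate the density at the held-out point.
        total += np.log(kde(data[i:i + 1])[0])
    return total


# Made-up sample and candidate grid for the example.
rng = np.random.default_rng(0)
data = rng.standard_normal(80)
factors = np.linspace(0.1, 1.5, 15)

scores = [loo_log_likelihood(data, f) for f in factors]
best_factor = factors[int(np.argmax(scores))]

# Refit on the full sample with the selected factor.
kde_best = gaussian_kde(data, bw_method=best_factor)
```

A generic version would take any fitter with a fit/evaluate style
interface instead of hard-coding gaussian_kde, which is the kind of
shared routine discussed above.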

For a full new subpackage, it might be better to go through a scikit
first, but that raises the barrier and the amount of work quite a bit.
Since I'm not doing much in the "smoother" area, I don't know how large
this would become, given that there are already noisy interpolators and
the filters in signal and ndimage.

(In statsmodels, we haven't started on non-parametric methods, so I
haven't looked into cross-validation yet, nor into other model
selection methods.)

>
> Ultimately, I think that the organization of functions into packages
> is much less important than having quality documentation that answers
> users' "How do I ...?" questions. I know it's much easier to move a
> function into a new package than it is to write good docs, but it
> doesn't solve the real problem.

I fully agree with this.

Josef

>
> --
> Robert Kern
>
> "I have come to believe that the whole world is an enigma, a harmless
> enigma that is made terrible by our own mad attempt to interpret it as
> though it had an underlying truth."
>  -- Umberto Eco